API Reference


This module includes one class, DefectPredictor, representing a defect predictor.

class radondp.predictors.DefectPredictor()

Class representing a defect predictor. It contains the logic to train a model, save it to and load it from disk, and use it to classify unseen instances.


Initialize the DefectPredictor.

balancers() -> list

Return the list of instances used to balance the training data.

balancers(balancers:List[str]) -> None

Set the balancers to train the model.

  Parameters:  balancers(List[str]) - a list of balancers (e.g., [none, rus, ros])
  Raise:  ValueError - if one or more balancers are not in [none, rus, ros]
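
The validation rule described by the Raise entry can be pictured with a minimal sketch. The helper name validate_choices is hypothetical and is not part of radondp; the sketch also assumes the accepted values are the plain strings "none", "rus" and "ros". It only illustrates the documented contract: any name outside the allowed set leads to a ValueError.

```python
from typing import Iterable, List

# Accepted names, copied from the Raise entry above (assumed to be plain strings).
ALLOWED_BALANCERS = ("none", "rus", "ros")


def validate_choices(values: Iterable[str], allowed: Iterable[str], kind: str) -> List[str]:
    """Return values as a list, or raise ValueError naming the unknown entries.

    Hypothetical helper for illustration; not the library's implementation.
    """
    allowed_set = set(allowed)
    chosen = list(values)
    unknown = [value for value in chosen if value not in allowed_set]
    if unknown:
        raise ValueError(
            f"Unknown {kind}: {unknown}; expected a subset of {sorted(allowed_set)}"
        )
    return chosen


if __name__ == "__main__":
    print(validate_choices(["none", "rus"], ALLOWED_BALANCERS, "balancers"))
    try:
        validate_choices(["smote"], ALLOWED_BALANCERS, "balancers")
    except ValueError as err:
        print(err)
```

The same check would apply to normalizers and classifiers by passing a different allowed set.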

normalizers() -> list

Return the list of instances used to normalize training and test data.

normalizers(normalizers:List[str]) -> None

Set the normalizers to scale data.

  Parameters:  normalizers(List[str]) - a list of normalizers (e.g., [none, minmax, std])
  Raise:  ValueError - if one or more normalizers are not in [none, minmax, std]


classifiers() -> list

Return the list of instances used to train the model classifier.


classifiers(classifiers:List[str]) -> None

Set the classifiers to train the model.

  Parameters:  classifiers(List[str]) - a list of classifiers (e.g., [dt, logit, nb, rf, svm])
  Raise:  ValueError - if one or more classifiers are not in [dt, logit, nb, rf, svm]
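
Because all three setters take lists, it may help to see how such lists could span a search space. This reference does not state how balancers, normalizers and classifiers are combined during training, so the sketch below rests on an explicit assumption: one candidate per (balancer, normalizer, classifier) combination. The function name candidate_configurations is hypothetical.

```python
from itertools import product
from typing import Iterable, Iterator, Tuple


def candidate_configurations(
    balancers: Iterable[str],
    normalizers: Iterable[str],
    classifiers: Iterable[str],
) -> Iterator[Tuple[str, str, str]]:
    """Yield one (balancer, normalizer, classifier) tuple per combination.

    Assumption for illustration only: the library may combine them differently.
    """
    return product(balancers, normalizers, classifiers)


if __name__ == "__main__":
    for config in candidate_configurations(["none", "rus"], ["minmax"], ["dt", "rf"]):
        print(config)
```

Under that assumption, the number of candidates is the product of the three list lengths, so short lists keep training time manageable.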

train(data:pandas.DataFrame) -> imblearn.pipeline.Pipeline

Train a new model.

  Parameters:  data(pandas.DataFrame) - the training data consisting of metrics and metadata about clean and failure_prone scripts
  Return: the best fitted estimator, that is, the one that maximizes the average_precision
  Raise:  Fail - if columns failure_prone, commit, committed_at, filepath are not in data
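
Since training fails when any of the four required columns is absent, a caller may want to check for them up front. The following is a minimal sketch; missing_columns is a hypothetical helper, not part of radondp. It accepts any iterable of column names, so with pandas it could be called as missing_columns(data.columns).

```python
from typing import Iterable, List

# Column names copied from the Raise entry above.
REQUIRED_COLUMNS = ("failure_prone", "commit", "committed_at", "filepath")


def missing_columns(columns: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in REQUIRED_COLUMNS order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


if __name__ == "__main__":
    # Hypothetical column list; "lines_code" stands in for any metric column.
    print(missing_columns(["commit", "filepath", "lines_code"]))
```

An empty result means the documented precondition on column names is met; it says nothing about the values stored in those columns.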

predict(unseen_data:pandas.DataFrame) -> bool

Predict an unseen instance as failure-prone or clean.

  Parameters:  unseen_data(pandas.DataFrame) - the unseen data consisting of the observations to predict
  Return: True if failure-prone; False otherwise
  Raise:  Exception - if no model has been loaded.
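
Because predict raises when no model has been loaded, the natural call order is load_model first, then predict. The sketch below wraps that order in a hypothetical helper, predict_with_saved_model, which is not part of radondp. It relies only on the two documented methods and works with any object that provides them, such as a DefectPredictor instance.

```python
def predict_with_saved_model(predictor, model_dir: str, unseen_data) -> str:
    """Load the model stored in model_dir, then label unseen_data.

    Hypothetical helper: predictor is any object exposing load_model and
    predict as documented above.
    """
    # load_model comes first because predict is documented to raise
    # an Exception when no model has been loaded.
    predictor.load_model(model_dir)
    is_failure_prone = predictor.predict(unseen_data)
    # Map the documented boolean result to a readable label.
    return "failure-prone" if is_failure_prone else "clean"
```

Returning a string label is a presentation choice of this sketch; the documented return value of predict itself is a bool.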

load_model(path_to_model_dir: str) -> None

Load a model from the disk.

  Parameters:  path_to_model_dir(str) - the path to the directory containing model-related files

dump_model(path_to_model_dir: str) -> None

Dump the trained model to the disk.

  Parameters:  path_to_model_dir(str) - the path to the directory where to save model-related files
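
Training and saving are typically done together, so the two documented calls can be chained as in the sketch below. The helper train_and_dump is hypothetical and not part of radondp. This reference does not say whether dump_model creates the target directory, so the sketch creates it beforehand as a precaution.

```python
from pathlib import Path


def train_and_dump(predictor, data, model_dir: str):
    """Train with predictor.train(data), then save with predictor.dump_model(model_dir).

    Hypothetical helper: predictor is any object exposing train and dump_model
    as documented above. Returns whatever train returns (the best estimator).
    """
    best_estimator = predictor.train(data)
    # Assumption: creating the directory up front is harmless; the reference
    # does not state whether dump_model creates it.
    Path(model_dir).mkdir(parents=True, exist_ok=True)
    predictor.dump_model(model_dir)
    return best_estimator
```

A later session could then pass the same directory to load_model before calling predict.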