Bayes Classifier or Naive Bayes Classifier
Train a BayesClassifier object
Arguments
- formula
a formula object describing the functional relationship between the input variables and the target variable
- data
a data.frame containing the dataset
- naive
logical; if TRUE, a Naive Bayes classifier (diagonal covariance matrices) is trained, otherwise a full Bayes classifier
- prior
type of class prior; either "uniform" (all classes are equally weighted) or "proportional" (each class is weighted by its number of samples in the training data); a small illustrative example follows this argument list
- var_eps
scalar added to the main diagonal of each covariance matrix to ensure numerical stability. If 0, nothing is added.
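The difference between the two prior options can be seen with a small, purely illustrative snippet (not taken from the package):
y <- factor(rep(c("a", "b", "b"), times = 5))
prop.table(table(y))                                   # "proportional": class frequencies in the training data
setNames(rep(1 / nlevels(y), nlevels(y)), levels(y))   # "uniform": every class gets the same weight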
Value
an S3 object of class `BayesClassifier`, which has the following internal structure (a short inspection sketch follows this list):
`param`: a list of model parameters; each list element represents one class and contains a vector `mu` (class mean), a matrix `Sigma` (class covariance) and a scalar `prior` (prior probability)
`prior`: the type of class prior used to call the BayesClassifier
`formula`: the formula used to call the BayesClassifier
`naive`: whether the model is a Naive Bayes or a full Bayes classifier
`n`: the number of training samples
`all.features`: the vector of all feature names in the training data
`logLik`: the value of the log-likelihood
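After fitting a model as in the Examples below (where the fitted object is called `mod`), the stored components can be inspected directly; `str()` is the safest way to see the exact layout of `param`:
str(mod$param)    # class-wise mu, Sigma and prior estimates
mod$all.features  # names of all features in the training data
mod$n             # number of training samples
mod$logLik        # value of the log-likelihood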
Details
The function trains a Bayes classifier on a given classification dataset, specified by a data.frame `data` and a formula `formula`. The specified target variable must be a factor. The function estimates class-wise mean values (`mu`), covariance matrices (`Sigma`) and prior probabilities (`prior`) to represent classes. If a Naive Bayes classifier is selected, only a diagonal covariance matrix is estimated. Prior options indicate whether a uniform prior (all classes are equally weighted) or a proportional prior (all classes are weighted by their proportions in the training data) should be used.
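As a rough, purely illustrative sketch of what the per-class estimation amounts to (this is not the package's internal code), assuming a numeric feature matrix X and a factor target y:
X <- matrix(rnorm(30), ncol = 2)
y <- factor(rep(1:3, each = 5))
var_eps <- 1e-9; naive <- TRUE
params <- lapply(levels(y), function(k) {
  Xk <- X[y == k, , drop = FALSE]
  Sigma <- cov(Xk)
  if (naive) Sigma <- diag(diag(Sigma), ncol(Xk))  # Naive Bayes: keep only the diagonal
  Sigma <- Sigma + diag(var_eps, ncol(Xk))         # stabilise the main diagonal
  list(mu    = colMeans(Xk),
       Sigma = Sigma,
       prior = sum(y == k) / length(y))            # "proportional"; "uniform" would be 1 / nlevels(y)
})
names(params) <- levels(y)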
Examples
set.seed(1)
dat <- data.frame(y = rep(c(1,2,3), each = 5),
                  x1 = as.vector(sapply(c(5,0,0), rnorm, n = 5)),
                  x2 = as.vector(sapply(c(0,0,5), rnorm, n = 5)))
mod <- BayesClassifier(y ~ ., dat, naive = TRUE)
summary(mod)
#> BayesClassifier model with 3 classes and 2 non-constant features
#> Note: the total number of features is 3
#> ==============================
#> formula: y ~ .
#> used features: x1, x2
#> parameters:
#>
#>
#> |mu |Sigma | prior|
#> |:---------|:-------------|-----:|
#> |5.13,0.46 |0.93,0,0,0.23 | 0.33|
#> |0.14,0.08 |0.46,0,0,1.45 | 0.33|
#> |0.04,4.65 |2.26,0,0,0.51 | 0.33|
predict(mod, newdata = expand.grid(x1 = c(0,5), x2 = c(0,5), y = NA), type = "class")
#> [1] "2" "1" "3" "3"
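As an additional, hypothetical sanity check not shown in the original example (output omitted), the in-sample predictions can be compared with the true labels:
mean(predict(mod, newdata = dat, type = "class") == dat$y)  # share of correctly classified training samples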