gaussianNB(Y, X, [varSmoothing=1e-9])
Y is a vector of class labels whose length equals the number of rows in table X. Each element of Y indicates the class that the corresponding row in X belongs to.
X is a table indicating the training set. Each row is a sample and each column is a feature.
varSmoothing is a positive floating-point number indicating the portion of the largest variance of all features that is added to the variances for calculation stability. The default value is 1e-9.
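The effect of varSmoothing can be written out as a short formula. This is a sketch that assumes the usual Gaussian Naive Bayes formulation, in which a variance is estimated for each class c and feature j; the symbols below are illustrative and are not taken from the function's internals.

```latex
% Assumed notation: \sigma_{c,j}^2 is the variance of feature j within class c,
% and \operatorname{Var}(X_k) is the variance of feature k over the whole training set.
\tilde{\sigma}_{c,j}^{2} \;=\; \sigma_{c,j}^{2} \;+\; \text{varSmoothing} \cdot \max_{k} \operatorname{Var}(X_k)
```

With the default varSmoothing=1e-9, the added term is very small relative to typical feature variances, but it keeps every smoothed variance strictly positive when a feature is constant within a class.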
Conduct Gaussian Naive Bayes classification. Return a dictionary with the following keys:
model: a RESOURCE data type variable. It is an internal binary resource generated by function gaussianNB and used by function predict.
modelName: string “GaussianNB”.
varSmoothing: varSmoothing parameter value.
The dataset iris.data used in the following example can be downloaded from https://archive.ics.uci.edu/ml/datasets/iris.
$ DATA_DIR = "C:/DolphinDB/Data"
$ t = loadText(DATA_DIR+"/iris.data")
$ t.rename!(`col0`col1`col2`col3`col4, `sepalLength`sepalWidth`petalLength`petalWidth`class)
$ t[`classType] = take(0, t.size())
$ update t set classType = 1 where class = "Iris-versicolor"
$ update t set classType = 2 where class = "Iris-virginica"
$ training = select sepalLength, sepalWidth, petalLength, petalWidth from t
$ labels = t.classType
$ model = gaussianNB(labels, training);
$ predict(model, training);