Title: | Multivariate Estimates and Tests |
---|---|
Description: | Multivariate estimation and testing for parametric data. The package performs several multivariate normality tests and outlier detection, and visualizes the results using the 'ggplot2' package. It also provides homogeneity tests for covariance matrices, the Hotelling's T-square test, and multivariate analysis of variance (MANOVA). Additional tests and visualization techniques, such as profile analysis and randomized complete block design, are being explored for future releases. |
Authors: | Yeonseok Choi [aut, cre], Yong-Seok Choi [ctb] |
Maintainer: | Yeonseok Choi <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-04 04:34:03 UTC |
Source: | https://github.com/yeonseok-choi/mvet |
Mean Value Parallel Coordinates Plot (used by HT2test and VManova)
.mean_parallel_plot(data, grp.name, scale = FALSE)
data |
A numeric matrix or data frame. If a data frame, the group (class) column can be a factor or a string. |
grp.name |
The name of a column of |
scale |
If |
Mean Value Parallel Coordinates Plot
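The idea behind a mean value parallel coordinates plot can be illustrated in base R. This is only a sketch: `mean_profile_sketch` is a hypothetical helper that is not part of the package, it draws with base graphics rather than 'ggplot2', and it uses the built-in `iris` data instead of `wine`.

```r
# Sketch only: group means per variable, drawn as one line per group.
# `mean_profile_sketch` is a hypothetical helper, not the package's plot function.
mean_profile_sketch <- function(data, grp.name) {
  data <- as.data.frame(data)
  vars <- setdiff(names(data), grp.name)
  # One row of means per group, one column per variable
  means <- aggregate(data[vars], by = list(group = data[[grp.name]]), FUN = mean)
  m <- as.matrix(means[vars])
  rownames(m) <- as.character(means$group)
  # Variables on the x-axis, one line per group
  matplot(t(m), type = "b", pch = 1, lty = 1, xaxt = "n",
          xlab = "Variable", ylab = "Group mean")
  axis(1, at = seq_along(vars), labels = vars)
  legend("topright", legend = rownames(m), col = seq_len(nrow(m)), lty = 1)
  invisible(m)
}

m <- mean_profile_sketch(iris, "Species")
```

The returned matrix has one row per group and one column per variable, which is the information the plot displays.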
Performs Box's M test for homogeneity of covariance matrices of multivariate normal data grouped by a single classification factor. The test is based on the chi-square approximation.
boxMtest(data, group)
data |
A numeric matrix or data frame. |
group |
A vector or factor; its length must equal the number of observations |
M.stat |
Box's M test statistic, which approximately follows a chi-square distribution. |
df |
The degrees of freedom associated with the test statistic. |
p.value |
The p-value of the test statistic. |
mardiatest
data(wine)
class <- wine$class
winedata <- subset(wine, select = -class)
boxMtest(winedata, class)
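For readers who want to see what the chi-square approximation involves, the textbook form of Box's M test can be sketched in base R. This is an illustration under assumptions, not the package's implementation: `box_m_sketch` is a hypothetical name, the output names merely mirror the documented ones, and the built-in `iris` data stands in for `wine`.

```r
# Sketch only: textbook Box's M test with the chi-square approximation.
# `box_m_sketch` is a hypothetical helper, not the package's boxMtest().
box_m_sketch <- function(data, group) {
  data <- as.matrix(data)
  group <- as.factor(group)
  p <- ncol(data)                 # number of variables
  g <- nlevels(group)             # number of groups
  n <- as.vector(table(group))    # group sizes, in the order of levels(group)
  N <- sum(n)

  # Per-group covariance matrices and the pooled covariance matrix
  covs <- lapply(levels(group), function(l) cov(data[group == l, , drop = FALSE]))
  pooled <- Reduce(`+`, Map(function(S, ni) (ni - 1) * S, covs, n)) / (N - g)

  # Log-determinants (determinant() returns the log modulus by default)
  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
  M <- (N - g) * logdet(pooled) - sum((n - 1) * sapply(covs, logdet))

  # Scaling constant of the chi-square approximation
  c1 <- (sum(1 / (n - 1)) - 1 / (N - g)) *
    (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
  stat <- (1 - c1) * M
  df <- p * (p + 1) * (g - 1) / 2

  list(M.stat = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

res <- box_m_sketch(iris[, 1:4], iris$Species)
res
```

A small p-value suggests that the group covariance matrices differ, in which case tests that pool the covariance matrices should be interpreted with care.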
Tests the mean vector with the Hotelling's T-square test, either for one sample or for comparing two samples. The data are assumed to satisfy multivariate normality and homogeneity of covariance matrices.
HT2test(data1, data2, mu0 = NULL, sample = "two", plot.scale = FALSE)
data1 |
The data frame or matrix must consist of only numbers, and the data must consist of only a single group or class. It should not contain columns that separate groups or classes. |
data2 |
The data frame or matrix must consist of only numbers, and the data must consist of only a single group or class. It should not contain columns that separate groups or classes. The |
mu0 |
The mu0 is used to test the mean vector hypothesis of |
sample |
The options for specifying the number of groups for group comparisons are |
plot.scale |
If |
One.HT2 |
Results of the one-sample Hotelling's T-square test: the degrees of freedom for the F test, the Hotelling's T-square statistic, the F test statistic, and the p-value. |
Mean.val.plot |
Plot the mean value parallel coordinates, representing the two samples using the mean values for each variable. |
Two.HT2 |
Results of the two-sample Hotelling's T-square test: the degrees of freedom for the F test, the Hotelling's T-square statistic, the F test statistic, and the p-value. |
Johnson, R. A., & Wichern, D. W. (2007). Applied Multivariate Statistical Analysis (6th ed.). Pearson Prentice Hall.
mardiatest for multivariate normality (includes outlier removal)
PPCCtest for multivariate normality
SPCCtest for multivariate normality
boxMtest for homogeneity of covariance matrices
data(wine)
class1.wine <- subset(wine, class == 1)[, -1]
class2.wine <- subset(wine, class == 2)[, -1]
modified.class2.wine <- outlier(class2.wine, lim = 0, level = 0.05, option = "all")$modified.data

## one sample
value <- 0
p <- ncol(class1.wine)
mu0 <- matrix(rep(value, p), nrow = p, ncol = 1)
HT2test(data1 = class1.wine, mu0 = mu0, sample = "one")

## two sample
HT2test(data1 = class1.wine, data2 = modified.class2.wine, sample = "two", plot.scale = TRUE)
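The relationship between the Hotelling's T-square statistic and the F test reported above can be sketched in base R using the standard formulas from Johnson and Wichern. The helper names `ht2_one_sketch` and `ht2_two_sketch` are hypothetical, and the code is an illustration rather than the package's own implementation.

```r
# Sketch only: textbook one- and two-sample Hotelling's T-square tests.
# Both helpers are hypothetical and are not the package's HT2test().
ht2_one_sketch <- function(x, mu0) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  d <- colMeans(x) - as.vector(mu0)
  T2 <- n * drop(t(d) %*% solve(cov(x), d))
  Fstat <- (n - p) / (p * (n - 1)) * T2      # F with (p, n - p) df
  list(T2 = T2, F = Fstat, df1 = p, df2 = n - p,
       p.value = pf(Fstat, p, n - p, lower.tail = FALSE))
}

ht2_two_sketch <- function(x, y) {
  x <- as.matrix(x); y <- as.matrix(y)
  n1 <- nrow(x); n2 <- nrow(y); p <- ncol(x)
  # Pooled covariance matrix (assumes equal population covariances)
  Sp <- ((n1 - 1) * cov(x) + (n2 - 1) * cov(y)) / (n1 + n2 - 2)
  d <- colMeans(x) - colMeans(y)
  T2 <- (n1 * n2 / (n1 + n2)) * drop(t(d) %*% solve(Sp, d))
  df2 <- n1 + n2 - p - 1
  Fstat <- df2 / (p * (n1 + n2 - 2)) * T2    # F with (p, n1 + n2 - p - 1) df
  list(T2 = T2, F = Fstat, df1 = p, df2 = df2,
       p.value = pf(Fstat, p, df2, lower.tail = FALSE))
}

# Built-in iris data used as a stand-in for the wine data
setosa <- iris[iris$Species == "setosa", 1:4]
versicolor <- iris[iris$Species == "versicolor", 1:4]
ht2_one_sketch(setosa, mu0 = c(5, 3.4, 1.5, 0.2))
ht2_two_sketch(setosa, versicolor)
```

With a single variable, T-square reduces to the square of the usual t statistic, which gives a simple way to sanity-check the formulas.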
Tests multivariate normality with Mardia's test, which uses multivariate skewness and kurtosis. Multivariate normality is accepted only when both the skewness test and the kurtosis test are passed.
mardiatest(data, level = 0.05, showplot = FALSE, showoutlier = FALSE, outlieropt = "all", shownewdata = FALSE)
data |
A numeric matrix or data frame. |
level |
The significance level of the skewness and kurtosis statistics. (default = |
showplot |
If |
showoutlier |
If |
outlieropt |
An |
shownewdata |
If |
mult.nomality |
Statistics and p-values for skewness and kurtosis, together with the resulting decision on whether multivariate normality is satisfied. |
QQPlot |
Shows Chi-Square Q-Q plot. |
... |
Same as the result of |
Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519-530.
Mardia, K. V. (1974). Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhya, 36, 115-128.
## Simple Mardia Test
data(wine)
class2.wine <- subset(wine, class == 2)[, -1]
mardiatest(class2.wine, level = 0.05, showplot = TRUE)

## Mardia Test and Outlier Detection
data(wine)
class2.wine <- subset(wine, class == 2)[, -1]
mardiatest(class2.wine, level = 0.05, showplot = TRUE, showoutlier = TRUE, outlieropt = "all", shownewdata = TRUE)
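The skewness and kurtosis statistics of Mardia (1970) can be computed directly in base R, which may help when interpreting the test output. The sketch below follows the textbook definitions; `mardia_sketch` and its output names are hypothetical, and the package may differ in details such as small-sample corrections.

```r
# Sketch only: Mardia's multivariate skewness and kurtosis tests.
# `mardia_sketch` is a hypothetical helper, not the package's mardiatest().
mardia_sketch <- function(data, level = 0.05) {
  x <- scale(as.matrix(data), center = TRUE, scale = FALSE)
  n <- nrow(x); p <- ncol(x)
  S <- crossprod(x) / n                  # maximum-likelihood covariance
  D <- x %*% solve(S) %*% t(x)           # n x n matrix of scaled cross-products

  b1 <- sum(D^3) / n^2                   # multivariate skewness
  b2 <- sum(diag(D)^2) / n               # multivariate kurtosis

  skew.stat <- n * b1 / 6                # approx. chi-square
  skew.df <- p * (p + 1) * (p + 2) / 6
  skew.p <- pchisq(skew.stat, skew.df, lower.tail = FALSE)

  kurt.stat <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)  # approx. N(0, 1)
  kurt.p <- 2 * pnorm(abs(kurt.stat), lower.tail = FALSE)

  list(b1 = b1, b2 = b2,
       skew.stat = skew.stat, skew.df = skew.df, skew.p = skew.p,
       kurt.stat = kurt.stat, kurt.p = kurt.p,
       # Normality is retained only if neither test rejects
       normal = (skew.p > level) && (kurt.p > level))
}

# Built-in iris data used as a stand-in for the wine data
mardia_sketch(iris[iris$Species == "setosa", 1:4])
```

The final `normal` flag reflects the rule stated in the description: both the skewness and the kurtosis tests must be passed.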
Detects outliers with Mardia's test, based on skewness and kurtosis. By default, no more than half of the observations can be flagged as outliers (this limit can be changed with the lim option).
outlier(data, lim = 0, level = 0.05, option = "all")
data |
A numeric matrix or data frame. |
lim |
The number of outliers detected can be limited. If 0 is entered, detection is possible up to half of the data. (default = |
level |
The significance level of the skewness and kurtosis statistics of the " |
option |
|
modified.data |
The modified data without outliers. |
modified.mvn |
The modified Mardia test result without outliers. |
outlier.num |
The position of outliers. |
outlier.cnt |
Total number of outliers. |
Jobson, J. D. (1992). Applied Multivariate Data Analysis. Springer-Verlag, New York.
data(wine)
class2.wine <- subset(wine, class == 2)[, -1]
outlier(class2.wine, lim = 0, level = 0.05, option = "all")
Tests multivariate normality using the correlation coefficient between the chi-square quantiles and the squared Mahalanobis distances, which is compared against the critical value table of Filliben (1975).
PPCCtest(data, level = 0.05)
data |
A numeric matrix or data frame. |
level |
At the |
data.cnt |
Observation |
PPCC.value |
Correlation coefficient value. |
critical.value |
Critical value proposed by Filliben (1975), corresponding to |
test.res |
Final result of multivariate normality. |
QQPlot |
Shows Chi-Square Q-Q plot. |
Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics, 17, 111-117.
data(wine)
class1.wine <- subset(wine, class == 1)[, -1]
PPCCtest(class1.wine, level = 0.05)
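The correlation coefficient behind a chi-square Q-Q plot can be sketched in base R as follows. This is an illustration under assumptions: `ppcc_sketch` is a hypothetical helper, the plotting positions `(i - 0.5) / n` are one common choice and may not match the package, and the comparison with Filliben's critical values is not reproduced here.

```r
# Sketch only: correlation between ordered squared Mahalanobis distances
# and chi-square quantiles. `ppcc_sketch` is a hypothetical helper; the
# plotting positions (i - 0.5) / n are an assumption.
ppcc_sketch <- function(data) {
  x <- as.matrix(data)
  n <- nrow(x); p <- ncol(x)
  d2 <- sort(mahalanobis(x, colMeans(x), cov(x)))   # ordered squared distances
  q <- qchisq((seq_len(n) - 0.5) / n, df = p)       # theoretical quantiles
  cor(d2, q)
}

# Built-in iris data used as a stand-in for the wine data
r <- ppcc_sketch(iris[iris$Species == "setosa", 1:4])
r
```

Values close to 1 indicate that the points of the Q-Q plot lie near a straight line; the decision itself requires the tabulated critical value for the sample size and significance level.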
Using principal component analysis, the number of principal components is chosen so that the cumulative explained variance ratio is at least 70%. The principal component score vectors for the selected components are then tested against the critical values defined by Filliben (1975). Users can also set the number of components manually.
SPCCtest(data, k = 0, level = 0.05)
data |
A numeric matrix or data frame. |
k |
The number of principal components can be manually selected. If 0 is entered, it automatically finds k components such that the explained variance ratio is at least 70%. (default = |
level |
At the |
Srivastava.QQplot |
Shows a chi-square Q-Q plot for each principal component using ggplot2. |
data.cnt |
Observation |
explain.ratio |
Displays all explained variance ratios. |
critical.value |
Critical value proposed by Filliben (1975), corresponding to |
result |
Final result of multivariate normality. |
Srivastava, M. S. (1984). A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Statistics & Probability Letters, 2(5), 263-267.
Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics, 17, 111-117.
data(wine)
class1.wine <- subset(wine, class == 1)[, -1]
SPCCtest(class1.wine, k = 5, level = 0.05)
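The component selection rule described for k = 0 can be sketched with `prcomp()` from base R. The helper `select_pcs_sketch` is hypothetical, and whether the package works from the covariance or the correlation matrix is not stated, so the covariance-based default of `prcomp()` used here is an assumption. Only the selection step is shown, not the test itself.

```r
# Sketch only: choose the number of principal components so that the
# cumulative explained variance ratio is at least 70%, or use a manual k.
# `select_pcs_sketch` is a hypothetical helper, not the package's SPCCtest().
select_pcs_sketch <- function(data, k = 0, threshold = 0.70) {
  pcs <- prcomp(as.matrix(data))            # covariance-based PCA (assumption)
  ratio <- pcs$sdev^2 / sum(pcs$sdev^2)     # explained variance ratios
  if (k == 0) {
    k <- which(cumsum(ratio) >= threshold)[1]
  }
  list(k = k,
       explain.ratio = ratio,
       scores = pcs$x[, seq_len(k), drop = FALSE])  # score vectors to be tested
}

# Built-in iris data used as a stand-in for the wine data
sel <- select_pcs_sketch(iris[, 1:4])
sel$k
sel$explain.ratio
```

Passing a positive `k` skips the automatic rule, mirroring the manual option described for the `k` argument.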
Performs one-way or two-way multivariate analysis of variance (MANOVA) for data that satisfy multivariate normality and homogeneity of covariance matrices.
VManova(data, grp1.name, grp2.name, way = "one", method = "all", plot.scale = FALSE)
data |
A numeric matrix or data frame. If a data frame, the group (class) column can be a factor or a string. |
grp1.name |
The name of the first group (or class) column in the input data, specified as a |
grp2.name |
The name of the second group (or class) column in the input data, specified as a |
way |
The type of MANOVA to perform (" |
method |
The method for MANOVA analysis. " |
plot.scale |
If |
Mean.val.plot |
Plot the mean value parallel coordinates, representing each group by its mean value for each variable. |
One.all |
Outputs the results of a one-way MANOVA test. It displays, in order, the degrees of freedom (Df1, Df2) of the F-distribution, the Wilks, Lawley-Hotelling, Pillai, and Roy statistics, the F test statistic, and the p-value. |
Two.all |
Outputs the results of a two-way MANOVA test. It displays, in order, the degrees of freedom (Df1, Df2) of the F-distribution, the Wilks, Lawley-Hotelling, Pillai, and Roy statistics, the F test statistic, and the p-value. |
Rencher, A. C., & Christensen, W. F. (2002). Methods of Multivariate Analysis. John Wiley & Sons, Inc., New York.
mardiatest for multivariate normality (includes outlier removal)
PPCCtest for multivariate normality
SPCCtest for multivariate normality
boxMtest for homogeneity of covariance matrices
data(wine)

## one way
VManova(wine, grp1.name = "class", way = "one", method = "all", plot.scale = TRUE)

## two way
newwine <- wine
# (1: low, 2: medium, 3: high)
newwine$v4 <- ifelse(wine$v4 <= 17, 1, ifelse(wine$v4 <= 22, 2, 3))
VManova(newwine, grp1.name = "class", grp2.name = "v4", way = "two", method = "all", plot.scale = TRUE)
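For comparison, the same four MANOVA statistics are available in base R through `stats::manova()` and `summary()`. The sketch below collects them into one table for a one-way design; it uses the built-in `iris` data and is not the package's implementation, so the layout of the output differs from that of VManova.

```r
# Sketch only: one-way MANOVA with all four test statistics via base R.
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris)

tests <- c("Wilks", "Hotelling-Lawley", "Pillai", "Roy")
out <- t(sapply(tests, function(tt) {
  # Row for the grouping factor in the summary table of this test
  s <- summary(fit, test = tt)$stats["Species", ]
  c(statistic = unname(s[2]),          # column 2 holds the chosen statistic
    approx.F  = unname(s["approx F"]),
    num.Df    = unname(s["num Df"]),
    den.Df    = unname(s["den Df"]),
    p.value   = unname(s["Pr(>F)"]))
}))
out
```

Each row corresponds to one test statistic, with its F approximation, degrees of freedom, and p-value.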
These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.
wine
A data frame with 178 observations on the following 14 variables:
The class vector. The three cultivars of wine are represented by the integers 1 to 3.
Alcohol
Malic acid
Ash
Alcalinity of ash
Magnesium
Total phenols
Flavanoids
Nonflavanoid phenols
Proanthocyanins
Color intensity
Hue
OD280/OD315 of diluted wines
Proline
http://archive.ics.uci.edu/ml/datasets/Wine.