winsorize(X, limit, [inclusive=true], [nanPolicy='upper'])
X is a vector.
limit is a scalar or a vector of 2 elements giving the fraction of data to mask on each side of X, as floating-point numbers between 0 and 1. A scalar applies the same fraction to both sides. With n = size(X), the n*limit[0] smallest and the n*limit[1] largest elements are masked (each count is truncated or rounded as specified by inclusive), so about n*(1-sum(limit)) elements remain unmasked. Set an element of limit to 0 to leave that side unmasked.
inclusive is a Boolean scalar or a vector of 2 Boolean elements indicating whether the number of elements masked on each side is truncated (true) or rounded (false). A scalar applies to both sides. The default value is true.
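The interaction between limit and inclusive is easiest to see with numbers. The following Python sketch is not DolphinDB code and not the library's implementation; `mask_counts` is a hypothetical helper, and the round-half-up tie rule for inclusive=false is an assumption, since the text above only says "rounded".

```python
import math

def mask_counts(n, limit, inclusive=True):
    """Return (low, high): how many elements are masked on each side."""
    # A scalar applies to both sides; a pair gives (lower side, upper side).
    low_lim, high_lim = limit if isinstance(limit, (tuple, list)) else (limit, limit)
    low_inc, high_inc = inclusive if isinstance(inclusive, (tuple, list)) else (inclusive, inclusive)

    def count(fraction, inc):
        exact = n * fraction
        # inclusive=True truncates; inclusive=False rounds to the nearest
        # integer (round-half-up is an assumption of this sketch).
        return int(exact) if inc else math.floor(exact + 0.5)

    return count(low_lim, low_inc), count(high_lim, high_inc)

print(mask_counts(10, (0.12, 0.17)))                   # (1, 1)
print(mask_counts(10, (0.12, 0.17), inclusive=False))  # (1, 2)
```

With 10 elements, 0.12 and 0.17 give 1.2 and 1.7 elements per side. Truncation masks one element on each side, while rounding masks one on the lower side and two on the upper side.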
nanPolicy is a string indicating how to handle NULL values. The following options are available (default is 'upper'):
'upper': allows NULL values and treats them as the largest values of X.
'lower': allows NULL values and treats them as the smallest values of X.
'raise': throws an error if X contains NULL values.
'omit': performs the calculations ignoring NULL values.
Returns a winsorized version of X, in which the masked elements on each side are replaced by the nearest unmasked value.
$ x=1..10;
$ winsorize(x, 0.1);
[2,2,3,4,5,6,7,8,9,9]

$ winsorize(x, 0.12 0.17);
[2,2,3,4,5,6,7,8,9,9]

$ winsorize(x, 0.12 0.17, inclusive=false);
[2,2,3,4,5,6,7,8,8,8]

$ x=1..20;
$ x[19:]=NULL;
$ x;
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,]

$ winsorize(x, 0.1);
[3,3,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,18,18]

$ winsorize(x, 0.1, nanPolicy='upper');
[3,3,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,18,18]

$ winsorize(x, 0.1, nanPolicy='lower');
[2,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,17,17,2]
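The results above can be reproduced with a short Python sketch. It is an illustration of the documented behavior, not DolphinDB's implementation. `winsorize_sketch` is a hypothetical name, `None` stands in for NULL, limits are assumed to sum to less than 1, and rounding uses round-half-up as an assumption. The 'omit' policy is left out because the description does not spell out its details.

```python
import math

def winsorize_sketch(x, limit, inclusive=True, nan_policy="upper"):
    """Winsorize a list of numbers; None plays the role of NULL."""
    low_lim, high_lim = limit if isinstance(limit, (tuple, list)) else (limit, limit)
    low_inc, high_inc = inclusive if isinstance(inclusive, (tuple, list)) else (inclusive, inclusive)

    if nan_policy == "raise" and any(v is None for v in x):
        raise ValueError("input contains NULL values")
    if nan_policy not in ("upper", "lower", "raise"):
        raise NotImplementedError("this sketch covers 'upper', 'lower' and 'raise' only")

    n = len(x)

    def count(fraction, inc):
        exact = n * fraction
        # Truncate when inclusive is true, otherwise round (half-up: assumption).
        return int(exact) if inc else math.floor(exact + 0.5)

    low, high = count(low_lim, low_inc), count(high_lim, high_inc)

    # Rank positions by value; NULLs sort first for 'lower', last otherwise.
    null_rank = -1 if nan_policy == "lower" else 1

    def key(i):
        return (null_rank, 0) if x[i] is None else (0, x[i])

    order = sorted(range(n), key=key)

    out = list(x)
    if low:
        for i in order[:low]:
            out[i] = x[order[low]]            # smallest unmasked value
    if high:
        for i in order[n - high:]:
            out[i] = x[order[n - high - 1]]   # largest unmasked value
    return out

x = list(range(1, 20)) + [None]
print(winsorize_sketch(x, 0.1, nan_policy="lower"))
# [2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 17, 17, 2]
```

Under 'lower' the NULL counts as one of the two smallest elements, so it and the value 1 are both replaced by 2, while 18 and 19 are replaced by 17.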