Clustering algorithm - Data Mining

The Microsoft Sequence Clustering algorithm has several parameters that control the cluster count, the number of sequence states, and so on. By adjusting these settings, we can fine-tune the model’s accuracy. The algorithm parameters are:

  • Cluster_Count: The definition of Cluster_Count in the Microsoft Sequence Clustering algorithm is the same as in the Microsoft Clustering algorithm. It defines the number of clusters the model contains. Setting this value to 0 causes the algorithm to automatically choose the best number of clusters for predictive purposes. The default value for Cluster_Count is 0.
  • Minimum_Support: The definition of Minimum_Support in the Microsoft Sequence Clustering algorithm is the same as in the Microsoft Clustering algorithm. It is an integer that specifies the minimum number of cases in each cluster, which prevents the model from creating clusters with too few cases. The default value is 10.
  • Maximum_States: The definition of Maximum_States is the same as in the Microsoft Clustering algorithm. This integer parameter specifies the maximum number of states that a clustering attribute can have. The default value is 100; attributes with more than 100 states invoke feature selection.
  • Maximum_Sequence_States: This integer parameter defines the maximum number of states in the sequence attribute. The default value is 64, and users can override it. If the sequence data has more states than Maximum_Sequence_States, feature selection is invoked, and the selection is based on the popularity of the states in the marginal model.
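
To make the popularity-based feature selection described for Maximum_Sequence_States more concrete, here is a minimal Python sketch. It is not the Microsoft implementation: the function name `select_popular_states`, the sample page-visit sequences, and the choice to relabel dropped states as "Missing" are all assumptions made for illustration. It simply counts how often each state occurs across all sequences and keeps the most frequent ones.

```python
from collections import Counter

# Hypothetical placeholder label for states dropped by feature selection
# (an assumption for this sketch, not taken from the algorithm itself).
MISSING = "Missing"


def select_popular_states(sequences, max_states):
    """Keep the max_states most frequent states; relabel the rest as MISSING."""
    # Count every state occurrence across all sequences.
    counts = Counter(state for seq in sequences for state in seq)
    # most_common() sorts by descending count, so this keeps the most popular states.
    keep = {state for state, _ in counts.most_common(max_states)}
    return [[s if s in keep else MISSING for s in seq] for seq in sequences]


# Illustrative sample data: each inner list is one case's sequence of states.
sequences = [
    ["Home", "News", "Sports"],
    ["Home", "News", "Weather"],
    ["Home", "Sports", "News"],
]

# With a limit of 3, the least popular state ("Weather") is dropped.
reduced = select_popular_states(sequences, max_states=3)
for seq in reduced:
    print(seq)
```

When the limit is at least as large as the number of distinct states, as with the default of 64 on this small sample, the sequences pass through unchanged.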
