A typical problem in pattern recognition is the classification of objects. Each object is represented by a vector of values called features, and objects with similar features are grouped into classes.
Objects are classified by analyzing their structure, using a method of dynamic cluster analysis. The method identifies, for each class, a cluster of objects in the feature space. A cluster, understood as a set of objects with similar properties, is bounded by a closed surface; here a multidimensional parallelepiped is used. The system requires preliminary training on samples, which teaches it to determine the class of an object under consideration.
The ordinary pattern recognition task assumes that all object descriptions share a common set of features. In practice, the feature sets of different classes do not always coincide, so the cluster analysis method had to be extended to let the system work with variable feature sets. Bounding a cluster with a multidimensional parallelepiped permits this modification: depending on the subset of features used to describe a class, its parallelepiped lies in the corresponding subspace of the common feature space.
The training process consists of two phases: entering a sample and delimiting the classes.
Afterwards, to evaluate an object under consideration, its features are compared with the corresponding ranges in the class description, regardless of the object's other features.
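The two training phases and the range comparison can be illustrated with a minimal sketch. It assumes that each class's parallelepiped is simply the minimum-to-maximum range of every feature over that class's training objects; the text does not say how the ranges are actually derived, and all names here (`train`, `in_box`, `classify`) are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, float]              # feature name -> value; features may be absent
Box = Dict[str, Tuple[float, float]]   # feature name -> (low, high) range


def train(samples: Dict[str, List[Sample]],
          class_features: Dict[str, List[str]]) -> Dict[str, Box]:
    """Delimit each class with an axis-aligned box over its own feature subset.

    Assumption: the box is the min..max range of each feature in the sample.
    """
    boxes: Dict[str, Box] = {}
    for cls, objects in samples.items():
        box: Box = {}
        for feature in class_features[cls]:
            values = [o[feature] for o in objects if feature in o]
            if values:
                box[feature] = (min(values), max(values))
        boxes[cls] = box
    return boxes


def in_box(obj: Sample, box: Box) -> bool:
    """True if every feature of the class description is present and in range.

    Features of the object that the class does not use are ignored.
    """
    return all(f in obj and lo <= obj[f] <= hi for f, (lo, hi) in box.items())


def classify(obj: Sample, boxes: Dict[str, Box]) -> List[str]:
    """Return every class whose box contains the object."""
    return [cls for cls, box in boxes.items() if in_box(obj, box)]


training = {
    "A": [{"x": 1, "y": 2}, {"x": 3, "y": 5}],
    "B": [{"x": 10, "z": 0.5}, {"x": 12, "z": 0.9}],
}
class_features = {"A": ["x", "y"], "B": ["x", "z"]}
boxes = train(training, class_features)
```

Because each box only lists the features of its own class, class A lives in the (x, y) subspace and class B in the (x, z) subspace; an extra feature on the object does not affect the comparison.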
If the clusters corresponding to different classes are well separated, the classification problem is easily solved. If the clusters overlap to a large degree, satisfactory recognition is not possible.
A classified object is often assigned to several clusters when some of its features are missing. In such a case, feature weights may be used. The probability that an object belongs to a particular class of a differential set equals the ratio of the total weight of the object's features that fall within the corresponding cluster to the total weight of the features of that class.
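The weighted ratio described above can be written out as a short sketch. The function name `membership_score`, the dictionary layout, and the sample weights are illustrative assumptions; a missing feature is treated here as contributing no weight to the numerator.

```python
from typing import Dict, Tuple

Sample = Dict[str, float]
Box = Dict[str, Tuple[float, float]]


def membership_score(obj: Sample, box: Box, weights: Dict[str, float]) -> float:
    """Weight of the class features the object satisfies, divided by the
    total weight of all features in the class description."""
    total = sum(weights[f] for f in box)
    if total == 0:
        return 0.0
    matched = sum(
        weights[f]
        for f, (lo, hi) in box.items()
        if f in obj and lo <= obj[f] <= hi   # absent features add nothing
    )
    return matched / total


box = {"x": (0, 10), "y": (0, 5), "z": (1, 2)}
weights = {"x": 2.0, "y": 1.0, "z": 1.0}

partial = membership_score({"x": 5, "y": 3}, box, weights)   # z is missing
weaker = membership_score({"x": 5, "y": 7}, box, weights)    # y out of range, z missing
```

With these numbers, the first object matches weight 3 out of 4 and the second matches weight 2 out of 4, so candidate classes can be ranked even when some features are absent.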
If the object falls into an area where clusters intersect, analogues can be selected from the training sample: objects of the corresponding classes that fall into the same area in the feature subspace of the original object. Examining them can provide useful information on the direction of further research.
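One possible reading of the analogue search is sketched below. It assumes clusters are stored as per-feature ranges, that the intersection area is the overlap of those ranges on the features the object actually has, and that an analogue is any training object of a matching class lying inside that overlap. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Dict[str, float]
Box = Dict[str, Tuple[float, float]]


def intersection_region(boxes: Dict[str, Box], classes: List[str],
                        features: Iterable[str]) -> Box:
    """Overlap of the given classes' ranges, restricted to `features`."""
    region: Box = {}
    for f in features:
        ranges = [boxes[c][f] for c in classes if f in boxes[c]]
        if ranges:
            region[f] = (max(r[0] for r in ranges), min(r[1] for r in ranges))
    return region


def find_analogues(obj: Sample, boxes: Dict[str, Box],
                   training: Dict[str, List[Sample]]) -> List[Tuple[str, Sample]]:
    """Training objects that share the object's intersection area."""
    # Classes that contain the object on every feature it shares with them.
    hits = [
        c for c, box in boxes.items()
        if any(f in obj for f in box)
        and all(lo <= obj[f] <= hi for f, (lo, hi) in box.items() if f in obj)
    ]
    if len(hits) < 2:          # no intersection, hence no analogues to look for
        return []
    region = intersection_region(boxes, hits, obj.keys())
    return [
        (c, s)
        for c in hits
        for s in training[c]
        if all(f in s and lo <= s[f] <= hi for f, (lo, hi) in region.items())
    ]


boxes = {"A": {"x": (0, 6), "y": (0, 6)}, "B": {"x": (4, 10), "y": (4, 10)}}
training = {
    "A": [{"x": 1, "y": 1}, {"x": 5, "y": 5}],
    "B": [{"x": 4.5, "y": 5.5}, {"x": 9, "y": 9}],
}
analogues = find_analogues({"x": 5, "y": 4.5}, boxes, training)
```

Here the object lies in both boxes, the overlap is 4 to 6 on each axis, and one training object from each class is returned for closer inspection.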
The UQQ is implemented in C++. It has a screen interface; objects may be entered by import from Paradox and DBF files.
The system may be used as a research tool. The filtering mechanism makes it possible to restrict the range of any feature, to discard extremely high or low values, or to exclude a feature altogether. Statistics on how many objects fall into each class help to estimate the quality of recognition.
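The filtering step might look roughly like the sketch below. It is not the system's actual interface: the function `apply_filter` and its parameters are hypothetical, and it assumes that an out-of-range value is removed from the object (treated as missing) rather than the whole object being discarded.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Sample = Dict[str, float]


def apply_filter(samples: List[Sample],
                 limits: Optional[Dict[str, Tuple[float, float]]] = None,
                 excluded: Iterable[str] = ()) -> List[Sample]:
    """Return filtered copies of the samples.

    limits   -- allowed (low, high) range per feature; values outside are dropped
    excluded -- features withdrawn from consideration altogether
    """
    limits = limits or {}
    excluded = set(excluded)
    filtered = []
    for sample in samples:
        kept = {}
        for feature, value in sample.items():
            if feature in excluded:
                continue                      # feature withdrawn entirely
            if feature in limits:
                lo, hi = limits[feature]
                if not lo <= value <= hi:
                    continue                  # extreme value discarded
            kept[feature] = value
        filtered.append(kept)
    return filtered


raw = [{"x": 1, "y": 2, "z": 7}, {"x": 50, "y": 3, "z": 8}]
clean = apply_filter(raw, limits={"x": (0, 10)}, excluded=["z"])
```

Retraining on the filtered samples and comparing the per-class counts before and after is one way a researcher could judge the effect of a given filter.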