Examples of K-parametric clustering algorithms:
- K-Means: performs a heuristic optimization to minimize the within-cluster sum of squares (WCSS), also known as inertia, i.e., the sum of squared Euclidean distances from each point to its cluster centroid: WCSS = \sum_{i=1}^{k} \sum_{x \in C_i} \|x - \mu_i\|^2
- K-Medoids (PAM)
- Gaussian Mixture Models (GMMs) – though they model distributions, you still need to define k
- Spectral Clustering – often requires k for the number of eigenvectors/clusters
- 1D K-Means with dynamic programming: minimizes the same within-cluster sum of squares objective as standard K-Means, but for one-dimensional data the optimum can be computed exactly (https://medium.com/@andreys95/optimal-1d-k-means-with-dynamic-programming-4d6ff57b6244); see the DP sketch after the application list below. Its applications include:
1. Image and Signal Processing
- Edge detection in 1D signals: Identifying abrupt changes in intensity, such as in electrocardiogram (ECG) signals or sound waveforms.
- Grayscale image analysis: Clustering pixel intensities for thresholding and segmentation (e.g., Otsu’s method).
2. Anomaly Detection
- Outlier detection: Identifying unusual data points in a sequence, such as temperature logs, stock prices, or sensor readings.
- Network intrusion detection: Anomalous traffic volumes or latencies can be flagged using 1D clustering.
3. Finance and Economics
- Price segmentation: Grouping stock prices, customer expenditures, or transaction amounts into clusters for analysis or marketing.
- Economic indicator binning: Simplifying complex metrics like inflation rates or GDP growth into meaningful ranges.
4. Healthcare and Medicine
- Vital sign monitoring: Clustering heartbeat intervals, glucose levels, or other biometric time series to identify normal vs. abnormal ranges.
- Dosage grouping: Categorizing drug dosages for different patient groups or treatment levels.
5. Industrial and IoT Applications
- Sensor data clustering: Classifying temperature, vibration, or pressure readings for predictive maintenance.
- Energy usage analysis: Segmenting power consumption values to optimize resource distribution.
6. Education and Testing
- Score grading: Clustering test scores to assign grades or identify performance bands.
- Learning analytics: Grouping students by time spent or attempts on a quiz for intervention strategies.
7. Natural Language Processing (NLP)
- Word length or frequency clustering: Used in stylometric analysis or feature engineering in text mining.
8. Retail and Marketing
- Customer segmentation: Based on a single metric like frequency of purchase or average order value.
- Pricing strategy: Grouping products by their price points for tiered marketing approaches.
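Because the objective is the same WCSS and the data lies on a line, sorting the points makes every optimal cluster a contiguous interval, which is what lets dynamic programming find the exact optimum. Below is a minimal O(k·n²) Python sketch using prefix sums; kmeans_1d_dp is a hypothetical helper name, and the article linked above may use a faster formulation.

```python
import numpy as np

def kmeans_1d_dp(x, k):
    """Exact 1D k-means via dynamic programming, O(k * n^2).

    Sorting makes every optimal cluster a contiguous interval, so
    dp[c][i] = best WCSS of splitting the first i+1 sorted points
    into c clusters. Labels refer to the sorted copy of x.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # Prefix sums give the WCSS of any contiguous slice in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def wcss(i, j):  # cost of one cluster covering x[i..j] inclusive
        m = j - i + 1
        seg = s1[j + 1] - s1[i]
        return (s2[j + 1] - s2[i]) - seg * seg / m

    INF = float("inf")
    dp = np.full((k + 1, n), INF)
    back = np.zeros((k + 1, n), dtype=int)  # start of the last cluster
    for i in range(n):
        dp[1][i] = wcss(0, i)
    for c in range(2, k + 1):
        for i in range(c - 1, n):
            for j in range(c - 1, i + 1):  # last cluster is x[j..i]
                cand = dp[c - 1][j - 1] + wcss(j, i)
                if cand < dp[c][i]:
                    dp[c][i], back[c][i] = cand, j

    # Walk the back-pointers to recover labels and cluster means.
    labels = np.empty(n, dtype=int)
    centers = []
    i, c = n - 1, k
    while c >= 1:
        j = back[c][i] if c > 1 else 0
        labels[j:i + 1] = c - 1
        centers.append(x[j:i + 1].mean())
        i, c = j - 1, c - 1
    return labels, np.array(centers[::-1])

# e.g. kmeans_1d_dp([1.0, 1.1, 1.3, 8.0, 8.2, 15.0], k=3)
# -> labels [0 0 0 1 1 2], centers ≈ [1.133, 8.1, 15.0]
```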
Examples of Nonparametric clustering algorithms:
- DBSCAN – defines clusters based on density, not a fixed k
- OPTICS – an extension of DBSCAN, good for varying densities
- Mean Shift – mode-seeking algorithm that finds clusters around data density peaks
- Hierarchical Clustering – builds a dendrogram that can be cut at any level to form clusters
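To illustrate "no fixed k": a short scikit-learn sketch where DBSCAN discovers the cluster count from density alone (the eps and min_samples values here are illustrative, not universal settings).

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: a shape K-Means tends to split badly
# even with the right k, but density-based DBSCAN separates cleanly.
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# The number of clusters is an *output*; label -1 marks noise points.
print(len(set(labels)) - (1 if -1 in labels else 0))
```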
=========
K-Means++ is a smarter way to initialize centroids for the K-Means algorithm. It improves both the accuracy and stability of clustering by reducing the chance of converging to poor local minima. Standard K-Means picks initial centroids at random, which can lead to bad clusterings (poor local optima) and require multiple restarts to get good results. K-Means++ Initialization Steps:
- Randomly select the first centroid \mu_1 from the dataset.
- For each data point x, compute the squared distance D(x)^2 to the nearest already chosen centroid.
- Select the next centroid with probability:
P(x) = \frac{D(x)^2}{\sum_{x’ \in X} D(x’)^2}
→ This favors points far from existing centroids.
- Repeat steps 2–3 until k centroids are chosen.
- Run standard K-Means using these initialized centroids.
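A minimal NumPy sketch of these steps (kmeans_pp_init is a hypothetical name; in practice, scikit-learn's KMeans applies k-means++ seeding by default via init="k-means++"):

```python
import numpy as np

def kmeans_pp_init(X, k, seed=None):
    """K-Means++ seeding: pick k initial centroids from the rows of X."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Step 1: first centroid chosen uniformly at random.
    centroids = [X[rng.integers(n)]]
    for _ in range(k - 1):
        # Step 2: squared distance from each point to its nearest centroid.
        diffs = X[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diffs ** 2).sum(axis=-1).min(axis=1)
        # Step 3: sample proportionally to D(x)^2, favoring far-away points.
        centroids.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centroids)
```

The returned centroids would then be handed to the standard K-Means iterations as the starting point.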
Example:
- You’ve selected 1 centroid: \mu_1
- You have 5 data points with distances to \mu_1:
D(x_1)^2 = 1,\quad D(x_2)^2 = 4,\quad D(x_3)^2 = 9,\quad D(x_4)^2 = 16,\quad D(x_5)^2 = 0.25
- Total = 1 + 4 + 9 + 16 + 0.25 = 30.25
Then the probability of picking x_4 as the next centroid is: P(x_4) = \frac{16}{30.25} \approx 0.529
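A quick check of this arithmetic in Python:

```python
import numpy as np

d2 = np.array([1.0, 4.0, 9.0, 16.0, 0.25])  # D(x_i)^2 from the example
probs = d2 / d2.sum()                        # denominator = 30.25
print(probs)     # [0.033 0.132 0.298 0.529 0.008] (rounded)
print(probs[3])  # 0.5289... ≈ 0.529, matching P(x_4)
```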