[vtkusers] K-means values
Sara Rolfe
smrolfe at u.washington.edu
Tue Mar 15 18:51:44 EDT 2011
Hi David,
Thanks for your reply. Right now I'm using vtkKMeansStatistics
without learning and am following the example here:
http://www.vtk.org/Wiki/VTK/Examples/InfoVis/KMeansClustering
The output that I get using kMeansStatistics->GetOutput()->Dump()
shows the original value, the distance to the nearest cluster, and
the id of the cluster it is assigned to, instead of the cluster mean.
+-----------------+-----------------+------------------+
| Magnitude       | distance (0)    | closest id (0)   |
+-----------------+-----------------+------------------+
| 0.0657005       | 6.44972e-06     | 4                |
| 0.0652216       | 4.24651e-06     | 4                |
| 0.0646891       | 2.33557e-06     | 4                |
| 0.0641142       | 9.08931e-07     | 4                |
| 0.0635069       | 1.19747e-07     | 4                |
| 0.0666587       | 1.2235e-05      | 4                |
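
As a rough sanity check on these rows (a sketch, not VTK code): if the
"distance" column is assumed to be the squared distance from the value
to its cluster center, each row allows two candidate centers, one on
either side of the value. The candidates that agree across all rows
point to the likely center of cluster 4. The helper names below are
made up for illustration.

```python
import math

# Rows copied from the dump above: (Magnitude, distance, closest id).
rows = [
    (0.0657005, 6.44972e-06, 4),
    (0.0652216, 4.24651e-06, 4),
    (0.0646891, 2.33557e-06, 4),
    (0.0641142, 9.08931e-07, 4),
    (0.0635069, 1.19747e-07, 4),
    (0.0666587, 1.2235e-05, 4),
]


def implied_centers(value, dist):
    """Return both 1-D centers compatible with a squared distance `dist`.

    ASSUMPTION: `dist` is a squared Euclidean distance.
    """
    r = math.sqrt(dist)
    return value - r, value + r


def spread(xs):
    """Range of a sequence of numbers."""
    return max(xs) - min(xs)


lows, highs = zip(*(implied_centers(v, d) for v, d, _ in rows))

# Only one of the two candidate sets can agree across all rows.
candidates = lows if spread(lows) < spread(highs) else highs
center = sum(candidates) / len(candidates)

print(f"implied center of cluster 4: {center:.6f}")
print(f"agreement across rows: {spread(candidates):.2e}")
```

With these six rows the lower candidates agree closely, which supports
the squared-distance reading and puts the center of cluster 4 at
roughly 0.06316.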
I think I will probably use learning, but I'd like to get it working
without it first.
Thanks,
Sara
On Mar 15, 2011, at 3:27 PM, Thompson, David C wrote:
> Hi Sara,
>
>> I'm using vtkKMeansStatistics to successfully cluster data points.
>> However, I'm missing how you access the actual cluster mean values,
>> instead of just their labels. It looks like the order of the labels
>> may not correspond to the values of the means. Is this true?
>
> I'm not clear on what you mean by "label". I've run the filter on
> data with 2 columns (named x & y) and with 2 sets of initial cluster
> center coordinates specified on the LEARN_PARAMETERS input: one for
> k=2 and one for k=3. I get this table:
>
> +--------+---+------------+---------+-------------+-----------+-----------+
> | Run ID | k | Iterations | Error   | Cardinality | x         | y         |
> +--------+---+------------+---------+-------------+-----------+-----------+
> | 0      | 2 | 3          | 1528.94 | 772         | 0.166201  | 0.12059   |
> | 0      | 2 | 3          | 498.266 | 228         | 2.79467   | 2.99856   |
> | 1      | 3 | 15         | 546.596 | 397         | -0.341883 | -0.486857 |
> | 1      | 3 | 15         | 546.946 | 405         | 0.758854  | 0.855424  |
> | 1      | 3 | 15         | 381.077 | 198         | 2.99941   | 3.14951   |
> +--------+---+------------+---------+-------------+-----------+-----------+
>
> as the first block of output 1 (i.e.,
> GetOutputDataObject( 1 ).GetBlock( 0 ).Dump() will produce the
> above). The first 2 rows contain the cluster mean values
> corresponding to the run with k=2 and the final 3 rows have the same
> for the run with k=3. Because there are 2 coordinates (x & y) for
> each cluster center, there is no good way to order cluster centers
> by their means. Instead, their order matches the initial guesses at
> cluster centers specified on the LEARN_PARAMETERS input if it
> exists. Otherwise, the order is random because the initial guesses
> are produced randomly. Is this what you wanted to know?
>
> David
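
To make the row-order point in the quoted reply concrete, here is a
minimal sketch in plain Python (no VTK calls) that groups the rows of
David's table by Run ID and looks up a cluster mean by its position
within the run. It assumes that the "closest id" reported for a data
point indexes the clusters of a run in this same row order. That
assumption, and the names model_rows, centers and cluster_mean, are
illustrative and not part of the VTK API.

```python
from collections import defaultdict

# Rows copied from the quoted table: (Run ID, k, Cardinality, x, y).
model_rows = [
    (0, 2, 772, 0.166201, 0.12059),
    (0, 2, 228, 2.79467, 2.99856),
    (1, 3, 397, -0.341883, -0.486857),
    (1, 3, 405, 0.758854, 0.855424),
    (1, 3, 198, 2.99941, 3.14951),
]

centers = defaultdict(list)       # Run ID -> list of (x, y) cluster means
cardinality = defaultdict(int)    # Run ID -> total number of points
for run_id, k, count, x, y in model_rows:
    centers[run_id].append((x, y))
    cardinality[run_id] += count


def cluster_mean(run_id, closest_id):
    """Look up a cluster mean by run and by position within that run.

    ASSUMPTION: closest_id follows the row order of the model table.
    """
    return centers[run_id][closest_id]


for run_id, means in centers.items():
    print(f"run {run_id}: {len(means)} clusters, "
          f"{cardinality[run_id]} points, means = {means}")
```

Each run contributes exactly k rows, and the cardinalities within a run
add up to the same total number of points, which is a quick way to
check that the rows were grouped correctly.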