Using PCA or something similar to get a visualisation of cluster assignments from a text file?


Posted: 2020-04-26 09:43:38

Question:

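I am trying to use PCA, t-SNE, or some other dimensionality reduction technique to get a visualisation of cluster assignments from a text file in the following format (the first column is the instance number, the second column is the cluster that instance belongs to). Is this possible? Any advice on how to go about it would be great.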

0   1
1   0
2   1
3   0
4   0
5   0
6   1
7   0
8   1
9   0
10  1
11  1
12  1
13  1
14  1
15  0

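Comments:

How big is your data? As you mention, these are dimensionality reduction techniques. Once you have applied one, you can visualise the data in the lower-dimensional space.

Answer 1: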

My first answer on *** (ahhh ^^')

For a PCA implementation, see here: PCA Implementation in Java
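PCA has to be run on the feature vectors that were clustered; the two-column file from the question only holds labels, so there is nothing in it to reduce. As a rough illustration of what a PCA implementation does, here is a minimal plain-Java sketch. The class name `PcaSketch`, the method `project`, and the sample data are made up for this example. It assumes at least two samples and `k` no larger than the number of features, and it finds the components by power iteration, which is simple but not what a production library would necessarily use.

```java
import java.util.Random;

/** Minimal PCA sketch: centre the data, build the covariance matrix,
 *  find the leading components by power iteration, then project. */
final class PcaSketch {

    /** Projects data (n samples x d features) onto its top k principal components. */
    static double[][] project(double[][] data, int k) {
        int n = data.length, d = data[0].length;

        // 1. Centre every feature at zero mean.
        double[] mean = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;
        double[][] centred = new double[n][d];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < d; j++) centred[i][j] = data[i][j] - mean[j];

        // 2. Sample covariance matrix (d x d).
        double[][] cov = new double[d][d];
        for (double[] row : centred)
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++) cov[a][b] += row[a] * row[b] / (n - 1);

        // 3. Power iteration: repeatedly multiply by cov and re-normalise.
        //    Keeping each vector orthogonal to the components already found
        //    makes the iteration converge to the next component.
        Random rng = new Random(42); // fixed seed so runs are repeatable
        double[][] components = new double[k][];
        for (int c = 0; c < k; c++) {
            double[] v = new double[d];
            for (int j = 0; j < d; j++) v[j] = rng.nextGaussian();
            orthonormalise(v, components, c);
            for (int iter = 0; iter < 500; iter++) {
                double[] w = new double[d];
                for (int a = 0; a < d; a++)
                    for (int b = 0; b < d; b++) w[a] += cov[a][b] * v[b];
                if (!orthonormalise(w, components, c)) break; // no variance left
                v = w;
            }
            components[c] = v;
        }

        // 4. Coordinates of each sample along the k components.
        double[][] out = new double[n][k];
        for (int i = 0; i < n; i++)
            for (int c = 0; c < k; c++)
                for (int j = 0; j < d; j++) out[i][c] += centred[i][j] * components[c][j];
        return out;
    }

    /** Removes the parts of v along basis[0..count) and scales v to unit length.
     *  Returns false if (almost) nothing is left of v. */
    private static boolean orthonormalise(double[] v, double[][] basis, int count) {
        for (int i = 0; i < count; i++) {
            double dot = 0;
            for (int j = 0; j < v.length; j++) dot += v[j] * basis[i][j];
            for (int j = 0; j < v.length; j++) v[j] -= dot * basis[i][j];
        }
        double norm = 0;
        for (double x : v) norm += x * x;
        norm = Math.sqrt(norm);
        if (norm < 1e-12) return false;
        for (int j = 0; j < v.length; j++) v[j] /= norm;
        return true;
    }

    public static void main(String[] args) {
        // Made-up 3-feature data, reduced to 2 dimensions for plotting.
        double[][] data = { {1, 2, 3}, {2, 4, 5}, {3, 5, 9}, {4, 9, 10} };
        for (double[] p : project(data, 2)) System.out.println(p[0] + "\t" + p[1]);
    }
}
```

The sign of each component is arbitrary, so the projected coordinates may come out mirrored; this does not affect how well clusters separate in a plot.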

For convenient plotting of any kind in Java, I recommend JavaFX (https://openjfx.io/)

Here is some sample code that creates charts with JavaFX:

    import javafx.application.Application;
    import javafx.collections.FXCollections;
    import javafx.collections.ObservableList;
    import javafx.geometry.Rectangle2D;
    import javafx.scene.Parent;
    import javafx.scene.Scene;
    import javafx.scene.chart.BarChart;
    import javafx.scene.chart.CategoryAxis;
    import javafx.scene.chart.LineChart;
    import javafx.scene.chart.NumberAxis;
    import javafx.scene.chart.PieChart;
    import javafx.scene.chart.XYChart;
    import javafx.scene.layout.HBox;
    import javafx.scene.layout.Pane;
    import javafx.scene.layout.VBox;
    import javafx.stage.Screen;
    import javafx.stage.Stage;

    /*
     * Chart types available in JavaFX:
     *
     *   PieChart
     *   LineChart
     *   AreaChart
     *   BubbleChart
     *   BarChart
     *   ScatterChart
     */

    public class MyCharts extends Application {

        private double screenFac = 1.25;
        private Rectangle2D screenSize;
        private double width;
        private double height;
        private Pane root;

        private Parent showMyCharts() {

            root = createRoot();

            /* Each chart needs its own axis objects;
             * the same NumberAxis object cannot be shared
             * between different charts.
             */
            final NumberAxis xAxis1 = new NumberAxis();
            final NumberAxis xAxis2 = new NumberAxis();
            final CategoryAxis yAxis1 = new CategoryAxis();
            final CategoryAxis yAxis2 = new CategoryAxis();

            // horizontal bar chart: numeric x axis, category y axis
            final BarChart<Number, String> bc1 = new BarChart<>(xAxis1, yAxis1);
            // the line chart gets its own axis objects
            final LineChart<Number, String> l1 = new LineChart<>(xAxis2, yAxis2);

            // define chart settings
            bc1.setTitle("myBarChart");
            xAxis1.setLabel("Value");
            xAxis1.setTickLabelRotation(90);
            yAxis1.setLabel("Item");

            l1.setTitle("myLineChart");
            xAxis2.setLabel("Value");

            // create some data series
            final String itemA = "A";
            final String itemB = "B";
            final String itemC = "F";

            XYChart.Series<Number, String> series1 = new XYChart.Series<>();
            series1.setName("2005");
            series1.getData().add(new XYChart.Data<Number, String>(45.0D, itemA));
            series1.getData().add(new XYChart.Data<Number, String>(55.0D, itemB));
            series1.getData().add(new XYChart.Data<Number, String>(75.0D, itemC));

            bc1.getData().add(series1);

            // an XYChart.Series cannot be shared either; the line chart needs a new object
            XYChart.Series<Number, String> series2 = new XYChart.Series<>();
            series2.setName("2005");
            series2.getData().add(new XYChart.Data<Number, String>(45.0D, itemA));
            series2.getData().add(new XYChart.Data<Number, String>(55.0D, itemB));
            series2.getData().add(new XYChart.Data<Number, String>(75.0D, itemC));

            l1.getData().add(series2);

            /* PieChart is a bit special: you can pass an observable list directly
             * when creating the chart, or add PieChart.Data objects one by one
             * after the chart object has been created.
             */
            ObservableList<PieChart.Data> pieChartData =
                    FXCollections.observableArrayList(
                    new PieChart.Data("Grapefruit", 13),
                    new PieChart.Data("Oranges", 25),
                    new PieChart.Data("Plums", 10),
                    new PieChart.Data("Pears", 22),
                    new PieChart.Data("Apples", 30));

            PieChart p2 = new PieChart(pieChartData);

            PieChart p1 = new PieChart();
            PieChart.Data slice1 = new PieChart.Data("Desktop", 213);
            PieChart.Data slice2 = new PieChart.Data("Phone"  , 67);
            PieChart.Data slice3 = new PieChart.Data("Tablet" , 36);

            p1.getData().add(slice1);
            p1.getData().add(slice2);
            p1.getData().add(slice3);

            // lay the four charts out in two columns
            HBox hbox = new HBox();

            VBox vBox1 = new VBox();
            VBox vBox2 = new VBox();

            vBox1.getChildren().addAll(l1, bc1);
            vBox2.getChildren().addAll(p1, p2);

            hbox.getChildren().addAll(vBox1, vBox2);

            root.getChildren().add(hbox);

            return root;
        }

        private Pane createRoot() {

            screenSize = Screen.getPrimary().getBounds();
            width = screenSize.getWidth();
            height = screenSize.getHeight();

            root = new Pane(); // Pane is a plain layout without a grid or similar
            root.setPrefSize(width / screenFac, height / screenFac);

            return root;
        }

        @Override
        public void start(Stage stage) throws Exception {

            Scene scene = new Scene(showMyCharts());
            stage.setScene(scene);
            stage.show();
        }

        public static void main(String[] args) {

            launch(args);
        }
    }

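To connect this to the file format in the question, the cluster labels can be read and the already reduced 2-D points grouped by cluster before plotting. The following is a minimal plain-Java sketch; the class name `ClusterGrouping`, its method names, and the sample coordinates are made up for illustration, and it assumes that `points[i]` holds the 2-D coordinates of instance `i`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ClusterGrouping {

    /** Parses lines of the form "<instance> <cluster>" into a map instance -> cluster. */
    static Map<Integer, Integer> parseAssignments(List<String> lines) {
        Map<Integer, Integer> assignment = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue; // skip blank lines
            String[] parts = trimmed.split("\\s+");
            assignment.put(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        }
        return assignment;
    }

    /** Groups 2-D points by cluster label; points[i] belongs to instance i. */
    static Map<Integer, List<double[]>> groupByCluster(Map<Integer, Integer> assignment,
                                                       double[][] points) {
        Map<Integer, List<double[]>> groups = new TreeMap<>();
        for (Map.Entry<Integer, Integer> e : assignment.entrySet()) {
            groups.computeIfAbsent(e.getValue(), c -> new ArrayList<>())
                  .add(points[e.getKey()]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // First four lines of the file from the question.
        List<String> lines = Arrays.asList("0   1", "1   0", "2   1", "3   0");
        // Made-up 2-D coordinates standing in for the output of PCA or t-SNE.
        double[][] points = { {0.1, 0.2}, {1.5, 1.1}, {0.3, 0.1}, {1.4, 1.3} };

        Map<Integer, List<double[]>> groups =
                groupByCluster(parseAssignments(lines), points);
        for (Map.Entry<Integer, List<double[]>> e : groups.entrySet()) {
            System.out.println("cluster " + e.getKey() + ": " + e.getValue().size() + " points");
        }
    }
}
```

In a real program the lines would come from the file, for example via `Files.readAllLines`. Each group can then become one `XYChart.Series<Number, Number>` in a `ScatterChart<Number, Number>` with two `NumberAxis` objects, adding one `XYChart.Data` per point, so that every cluster is drawn as its own series.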