Using PCA or something similar to get a visualisation of cluster assignments from a text file?
【Posted】: 2020-04-26 09:43:38
【Question】: I am trying to run PCA, t-SNE, or some other dimensionality-reduction technique to get a visualisation of cluster assignments from a text file in the format below (the first column is the instance number and the second column is the cluster that instance belongs to). Can this be done? Any suggestions on how I would go about it would be great.
0 1
1 0
2 1
3 0
4 0
5 0
6 1
7 0
8 1
9 0
10 1
11 1
12 1
13 1
14 1
15 0
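A minimal sketch of reading this two-column format in Java, assuming whitespace-separated integers with one instance per line. The class name `ClusterAssignments` and the method `parse` are made up for illustration and do not come from any library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: parses "instance cluster" lines such as "0 1". */
final class ClusterAssignments {

    /** Maps each instance number to its cluster label, preserving file order. */
    static Map<Integer, Integer> parse(List<String> lines) {
        Map<Integer, Integer> clusterByInstance = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected two columns: " + line);
            }
            clusterByInstance.put(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        }
        return clusterByInstance;
    }

    public static void main(String[] args) {
        // Inline sample taken from the first rows of the listing above.
        Map<Integer, Integer> clusters = parse(List.of("0 1", "1 0", "2 1", "3 0"));
        System.out.println(clusters);
    }
}
```

For a real file, the lines could come from `Files.readAllLines(Paths.get("clusters.txt"))`, where `clusters.txt` is a placeholder file name.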
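【Comments】:
How big is your data? As you mentioned, these are dimensionality-reduction techniques. Once you have applied one, you can visualise the data in a lower dimension.
【Answer 1】:
My first answer on *** (ahhh ^^')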
For a PCA implementation, see here: PCA Implementation in Java
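For orientation, here is a rough sketch of what projecting data onto its first principal components can look like in plain Java. It is not the linked implementation: it uses power iteration with re-orthogonalisation, which is only one of several ways to get the leading eigenvectors. It also assumes the original feature vectors are available as a `double[][]` with at least two rows, because a file holding only instance numbers and cluster labels has no features to reduce. The names `SimplePca` and `project` are invented for this example.

```java
import java.util.Arrays;

/** Hypothetical minimal PCA: centre, build covariance, power-iterate, project. */
final class SimplePca {

    /** Projects each row of data onto its first k principal components (k <= columns). */
    static double[][] project(double[][] data, int k) {
        int n = data.length;
        int d = data[0].length;

        // 1. Centre every feature on its mean.
        double[] mean = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j] / n;
            }
        }
        double[][] centred = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                centred[i][j] = data[i][j] - mean[j];
            }
        }

        // 2. Sample covariance matrix (d x d).
        double[][] cov = new double[d][d];
        for (double[] row : centred) {
            for (int a = 0; a < d; a++) {
                for (int b = 0; b < d; b++) {
                    cov[a][b] += row[a] * row[b] / (n - 1);
                }
            }
        }

        // 3. Leading eigenvectors by power iteration, kept orthogonal to earlier ones.
        double[][] components = new double[k][];
        for (int c = 0; c < k; c++) {
            double[] v = new double[d];
            for (int j = 0; j < d; j++) {
                v[j] = 1.0 + j + c; // arbitrary deterministic start vector
            }
            orthonormalise(v, components, c);
            for (int iter = 0; iter < 1000; iter++) {
                double[] next = new double[d];
                for (int a = 0; a < d; a++) {
                    for (int b = 0; b < d; b++) {
                        next[a] += cov[a][b] * v[b];
                    }
                }
                if (!orthonormalise(next, components, c)) {
                    break; // no variance left in this direction
                }
                v = next;
            }
            components[c] = v;
        }

        // 4. Coordinates of every centred row along each component.
        double[][] projected = new double[n][k];
        for (int i = 0; i < n; i++) {
            for (int c = 0; c < k; c++) {
                for (int j = 0; j < d; j++) {
                    projected[i][c] += centred[i][j] * components[c][j];
                }
            }
        }
        return projected;
    }

    /** Removes the first `count` components from v and scales v to unit length. */
    private static boolean orthonormalise(double[] v, double[][] earlier, int count) {
        for (int c = 0; c < count; c++) {
            double dot = 0;
            for (int j = 0; j < v.length; j++) {
                dot += v[j] * earlier[c][j];
            }
            for (int j = 0; j < v.length; j++) {
                v[j] -= dot * earlier[c][j];
            }
        }
        double norm = 0;
        for (double x : v) {
            norm += x * x;
        }
        norm = Math.sqrt(norm);
        if (norm < 1e-12) {
            return false; // vector vanished; caller keeps its previous estimate
        }
        for (int j = 0; j < v.length; j++) {
            v[j] /= norm;
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] features = {{0, 0}, {1, 2}, {2, 4}, {3, 6}}; // made-up sample data
        System.out.println(Arrays.deepToString(project(features, 2)));
    }
}
```

The fixed iteration count and the `1e-12` threshold are arbitrary choices for a sketch; a tested linear-algebra library is the safer option for real data.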
For easy plotting of any kind in Java, I recommend JavaFX (https://openjfx.io/)
Here is some sample code that creates charts with JavaFX:
import javafx.application.Application;
import javafx.collections.FXCollections;
import javafx.collections.ObservableList;
import javafx.geometry.Rectangle2D;
import javafx.scene.Parent;
import javafx.scene.Scene;
import javafx.scene.chart.BarChart;
import javafx.scene.chart.CategoryAxis;
import javafx.scene.chart.LineChart;
import javafx.scene.chart.NumberAxis;
import javafx.scene.chart.PieChart;
import javafx.scene.chart.XYChart;
import javafx.scene.layout.HBox;
import javafx.scene.layout.Pane;
import javafx.scene.layout.VBox;
import javafx.stage.Screen;
import javafx.stage.Stage;
/*
 * Chart types available in JavaFX:
 *
 *   PieChart
 *   LineChart
 *   AreaChart
 *   BubbleChart
 *   BarChart
 *   ScatterChart
 */
public class MyCharts extends Application {

    private double screenFac = 1.25;
    private Rectangle2D screenSize;
    private double width;
    private double height;
    private Pane root;

    private Parent showMyCharts() {
        root = createRoot();

        /* Each chart needs its OWN axis objects;
         * the same NumberAxis object cannot be shared
         * between different charts.
         */
        final NumberAxis xAxis1 = new NumberAxis();
        final NumberAxis xAxis2 = new NumberAxis();
        final NumberAxis xAxis3 = new NumberAxis();
        final CategoryAxis yAxis1 = new CategoryAxis();
        final CategoryAxis yAxis2 = new CategoryAxis();
        final CategoryAxis yAxis3 = new CategoryAxis();

        final BarChart<Number, String> bc1 = new BarChart<Number, String>(xAxis1, yAxis1);
        // the line chart uses a different pair of axis objects
        // final BarChart<Number, String> bc2 = new BarChart<Number, String>(xAxis2, yAxis2);
        LineChart<Number, String> l1 = new LineChart<>(xAxis2, yAxis2);

        // define BarChart settings
        bc1.setTitle("myBarChart");
        xAxis1.setLabel("Value");
        xAxis1.setTickLabelRotation(90);
        yAxis1.setLabel("Item");

        // define LineChart settings
        l1.setTitle("myLineChart");
        xAxis2.setLabel("Value");

        // create some data series
        XYChart.Series<Number, String> series1 = new XYChart.Series<>();
        final String itemA = "A";
        final String itemB = "B";
        final String itemC = "F";
        series1.setName("2005");
        series1.getData().add(new XYChart.Data<Number, String>(45.0D, itemA));
        series1.getData().add(new XYChart.Data<Number, String>(55.0D, itemB));
        series1.getData().add(new XYChart.Data<Number, String>(75.0D, itemC));
        // series1.getData().add(new XYChart.Data(44, itemB));
        // series1.getData().add(new XYChart.Data(18, itemC));
        bc1.getData().add(series1);

        // an XYChart.Series cannot be shared either; the line chart needs its own object
        XYChart.Series<Number, String> series2 = new XYChart.Series<>();
        series2.setName("2005");
        series2.getData().add(new XYChart.Data<Number, String>(45.0D, itemA));
        series2.getData().add(new XYChart.Data<Number, String>(55.0D, itemB));
        series2.getData().add(new XYChart.Data<Number, String>(75.0D, itemC));
        l1.getData().add(series2);

        /* PieChart is a bit special: you can pass the observable list directly when you create
         * the chart, or add PieChart.Data(..) objects after creating the PieChart object.
         */
        ObservableList<PieChart.Data> pieChartData =
                FXCollections.observableArrayList(
                        new PieChart.Data("Grapefruit", 13),
                        new PieChart.Data("Oranges", 25),
                        new PieChart.Data("Plums", 10),
                        new PieChart.Data("Pears", 22),
                        new PieChart.Data("Apples", 30));
        PieChart p2 = new PieChart(pieChartData);

        PieChart p1 = new PieChart();
        PieChart.Data slice1 = new PieChart.Data("Desktop", 213);
        PieChart.Data slice2 = new PieChart.Data("Phone", 67);
        PieChart.Data slice3 = new PieChart.Data("Tablet", 36);
        p1.getData().add(slice1);
        p1.getData().add(slice2);
        p1.getData().add(slice3);

        // lay the charts out: line + bar on the left, the two pies on the right
        HBox hbox = new HBox();
        VBox vBox1 = new VBox();
        VBox vBox2 = new VBox();
        vBox1.getChildren().addAll(l1, bc1);
        vBox2.getChildren().addAll(p1, p2);
        hbox.getChildren().addAll(vBox1, vBox2);

        // bc.getData().addAll(series1, series2, series3);
        // XYChart<>.Series series1 = new XYChart.Series();
        // series1.setName("2003");
        // series1.getData().add(new XYChart.Data(2, itemA));
        // series1.getData().add(new XYChart.Data(20, itemB));
        // series1.getData().add(new XYChart.Data(10, itemC));

        root.getChildren().add(hbox);
        return root;
    }

    private Pane createRoot() {
        screenSize = Screen.getPrimary().getBounds();
        width = screenSize.getWidth();
        height = screenSize.getHeight();
        root = new Pane(); // Pane is a plain layout with no grid or similar structure
        root.setPrefSize(width / screenFac, height / screenFac);
        return root;
    }

    @Override
    public void start(Stage stage) throws Exception {
        Scene scene = new Scene(showMyCharts());
        stage.setScene(scene);
        stage.show();
    }

    public static void main(String[] charts) {
        launch(charts);
    }
}
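The sample above shows bar, line, and pie charts. For cluster assignments, the ScatterChart type listed in the comment is probably the closer fit, with one series per cluster so that each cluster gets its own colour. The sketch below covers only the data preparation, in plain Java so it runs without JavaFX: it groups 2-D points by cluster label. It assumes the points have already been reduced to two dimensions and that `clusterOf[i]` is the label of `points[i]`. The names `ClusterSeries` and `groupByCluster` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: one list of (x, y) points per cluster label. */
final class ClusterSeries {

    /** Groups 2-D points by cluster label; points[i] belongs to clusterOf[i]. */
    static Map<Integer, List<double[]>> groupByCluster(double[][] points, int[] clusterOf) {
        if (points.length != clusterOf.length) {
            throw new IllegalArgumentException("Need exactly one label per point");
        }
        // TreeMap keeps clusters in ascending label order, so series order is stable.
        Map<Integer, List<double[]>> series = new TreeMap<>();
        for (int i = 0; i < points.length; i++) {
            series.computeIfAbsent(clusterOf[i], label -> new ArrayList<>()).add(points[i]);
        }
        return series;
    }

    public static void main(String[] args) {
        double[][] points = {{0.1, 0.9}, {2.0, 1.5}, {0.3, 1.1}, {2.2, 1.4}}; // made-up coordinates
        int[] clusterOf = {1, 0, 1, 0}; // labels as in the first rows of the question's file
        groupByCluster(points, clusterOf).forEach((label, pts) ->
                System.out.println("cluster " + label + ": " + pts.size() + " points"));
    }
}
```

Each list could then be copied into its own `XYChart.Series<Number, Number>` (one `XYChart.Data` per point) and added to a `ScatterChart` built on two `NumberAxis` objects, following the same pattern as the bar and line charts above.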