What's the meaning of this line of code? And how can I create an object of this class?
【Posted】: 2021-10-30 06:56:56
【Question】: I am trying to construct an object of the MTree class (https://github.com/Waikato/moa/blob/master/moa/src/main/java/moa/clusterers/outliers/utils/mtree/MTree.java).
The constructor of MTree looks like this:
public MTree(DistanceFunction<? super DATA> distanceFunction,
        SplitFunction<DATA> splitFunction) {
    this(DEFAULT_MIN_NODE_CAPACITY, distanceFunction, splitFunction);
}
Here DistanceFunction is an interface, and its code is:
/**
 * An object that can calculate the distance between two data objects.
 *
 * @param <DATA> The type of the data objects.
 */
public interface DistanceFunction<DATA> {
    double calculate(DATA data1, DATA data2);
}
Its implementation is:
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Some pre-defined implementations of {@linkplain DistanceFunction distance
 * functions}.
 */
public final class DistanceFunctions {

    /**
     * Don't let anyone instantiate this class.
     */
    private DistanceFunctions() {}

    /**
     * Creates a cached version of a {@linkplain DistanceFunction distance
     * function}. This method is used internally by {@link MTree} to create
     * a cached distance function to pass to the {@linkplain SplitFunction split
     * function}.
     * @param distanceFunction The distance function to create a cached version
     *        of.
     * @return The cached distance function.
     */
    public static <Data> DistanceFunction<Data> cached(final DistanceFunction<Data> distanceFunction) {
        return new DistanceFunction<Data>() {
            class Pair {
                Data data1;
                Data data2;

                public Pair(Data data1, Data data2) {
                    this.data1 = data1;
                    this.data2 = data2;
                }

                @Override
                public int hashCode() {
                    return data1.hashCode() ^ data2.hashCode();
                }

                @Override
                public boolean equals(Object arg0) {
                    if(arg0 instanceof Pair) {
                        Pair that = (Pair) arg0;
                        return this.data1.equals(that.data1)
                            && this.data2.equals(that.data2);
                    } else {
                        return false;
                    }
                }
            }

            private final Map<Pair, Double> cache = new HashMap<Pair, Double>();

            @Override
            public double calculate(Data data1, Data data2) {
                Pair pair1 = new Pair(data1, data2);
                Double distance = cache.get(pair1);
                if(distance != null) {
                    return distance;
                }
                Pair pair2 = new Pair(data2, data1);
                distance = cache.get(pair2);
                if(distance != null) {
                    return distance;
                }
                distance = distanceFunction.calculate(data1, data2);
                cache.put(pair1, distance);
                cache.put(pair2, distance);
                return distance;
            }
        };
    }

    /**
     * An interface to represent coordinates in Euclidean spaces.
     * @see <a href="http://en.wikipedia.org/wiki/Euclidean_space">"Euclidean
     *      Space" article at Wikipedia</a>
     */
    public interface EuclideanCoordinate {
        /**
         * The number of dimensions.
         */
        int dimensions();

        /**
         * A method to access the {@code index}-th component of the coordinate.
         *
         * @param index The index of the component. Must be less than {@link
         *              #dimensions()}.
         */
        double get(int index);
    }

    /**
     * Calculates the distance between two {@linkplain EuclideanCoordinate
     * euclidean coordinates}.
     */
    public static double euclidean(EuclideanCoordinate coord1, EuclideanCoordinate coord2) {
        int size = Math.min(coord1.dimensions(), coord2.dimensions());
        double distance = 0;
        for(int i = 0; i < size; i++) {
            double diff = coord1.get(i) - coord2.get(i);
            distance += diff * diff;
        }
        distance = Math.sqrt(distance);
        return distance;
    }

    /**
     * A {@linkplain DistanceFunction distance function} object that calculates
     * the distance between two {@linkplain EuclideanCoordinate euclidean
     * coordinates}.
     */
    public static final DistanceFunction<EuclideanCoordinate> EUCLIDEAN = new DistanceFunction<DistanceFunctions.EuclideanCoordinate>() {
        @Override
        public double calculate(EuclideanCoordinate coord1, EuclideanCoordinate coord2) {
            return DistanceFunctions.euclidean(coord1, coord2);
        }
    };

    /**
     * A {@linkplain DistanceFunction distance function} object that calculates
     * the distance between two coordinates represented by {@linkplain
     * java.util.List lists} of {@link java.lang.Integer}s.
     */
    public static final DistanceFunction<List<Integer>> EUCLIDEAN_INTEGER_LIST = new DistanceFunction<List<Integer>>() {
        @Override
        public double calculate(List<Integer> data1, List<Integer> data2) {
            class IntegerListEuclideanCoordinate implements EuclideanCoordinate {
                List<Integer> list;
                public IntegerListEuclideanCoordinate(List<Integer> list) { this.list = list; }
                @Override public int dimensions() { return list.size(); }
                @Override public double get(int index) { return list.get(index); }
            }
            IntegerListEuclideanCoordinate coord1 = new IntegerListEuclideanCoordinate(data1);
            IntegerListEuclideanCoordinate coord2 = new IntegerListEuclideanCoordinate(data2);
            return DistanceFunctions.euclidean(coord1, coord2);
        }
    };

    /**
     * A {@linkplain DistanceFunction distance function} object that calculates
     * the distance between two coordinates represented by {@linkplain
     * java.util.List lists} of {@link java.lang.Double}s.
     */
    public static final DistanceFunction<List<Double>> EUCLIDEAN_DOUBLE_LIST = new DistanceFunction<List<Double>>() {
        @Override
        public double calculate(List<Double> data1, List<Double> data2) {
            class DoubleListEuclideanCoordinate implements EuclideanCoordinate {
                List<Double> list;
                public DoubleListEuclideanCoordinate(List<Double> list) { this.list = list; }
                @Override public int dimensions() { return list.size(); }
                @Override public double get(int index) { return list.get(index); }
            }
            DoubleListEuclideanCoordinate coord1 = new DoubleListEuclideanCoordinate(data1);
            DoubleListEuclideanCoordinate coord2 = new DoubleListEuclideanCoordinate(data2);
            return DistanceFunctions.euclidean(coord1, coord2);
        }
    };
}
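My first question: what is the meaning of return new DistanceFunction<Data>() in the method public static <Data> DistanceFunction<Data> cached(final DistanceFunction<Data> distanceFunction) [the method is in the class DistanceFunctions]? I am just a beginner in Java, and this is a little hard for me to understand.
Also, to create an object of MTree, I should create a DistanceFunctions object and a ComposedSplitFunction object (which is an implementation of the SplitFunction interface) and pass them as arguments to the MTree constructor. But I really don't know how to do that, because the constructor of the DistanceFunctions class is private, so I cannot produce an argument for the MTree constructor. What should I do?
Update: what I want to do is create a JUnit test for MTree, and I believe the first thing I need to do is create an MTree object.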
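【Comments】:
DistanceFunction is an interface; you can write a class that implements it (in fact, a lambda will do). You can also pass one of the static distance functions such as EUCLIDEAN. It is hard to give you concrete advice on how to proceed, because you haven't said what you want to do with the MTree you are trying to create.
What I want to do is create a JUnit test for MTree, and I believe the first thing I need to do is create an MTree object so that I can do more with it later, right?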
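【Answer 1】:
An interface can have multiple implementations. It only defines the general contract that every implementation has to follow.
The cached implementation here takes a DistanceFunction as input and guarantees that the distance between A and B (or B and A) is calculated only once and is served from an internal cache map afterwards. The generic type of the cached function simply means that you can pass any data type to it. For example, in its simplest form you could have an implementation that takes two integers and calculates the absolute value of their difference, like this: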
DistanceFunction<Integer> func = (Integer a, Integer b) -> Math.abs(a - b);
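This is a lambda expression, which can also be written more verbosely as an anonymous class:
DistanceFunction<Integer> func = new DistanceFunction<Integer>() {
    @Override
    public double calculate(Integer data1, Integer data2) {
        return Math.abs(data1 - data2);
    }
};
You can then use it like this to cache the return value for the given input arguments: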
DistanceFunction<Integer> cache = DistanceFunctions.cached(func);
double distance = cache.calculate(10, 5);
If you later make a call like
distance = cache.calculate(10, 5);
again, or even
distance = cache.calculate(5, 10);
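the distance value is not recalculated in either case. It is returned from the internal cache map instead, because the distance for these arguments has already been computed. This is especially useful if you have a large number of data points but only a limited number of combinations of them, and the calculation is rather expensive.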
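To make this behaviour visible, here is a minimal, self-contained sketch that counts how often the underlying function actually runs. It is not the MOA code: the DistanceFunction interface is redeclared locally as a stand-in, the cached helper is a simplified version that keys the map on a two-element List instead of the library's inner Pair class, and the names CachedDistanceDemo and countDelegateCalls are hypothetical. It also illustrates what return new DistanceFunction<D>() { ... } means: the expression creates an object of an unnamed (anonymous) class that implements the interface on the spot, with its own cache field.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedDistanceDemo {

    /** Local stand-in for the library interface so the sketch compiles on its own. */
    public interface DistanceFunction<DATA> {
        double calculate(DATA data1, DATA data2);
    }

    /** Simplified stand-in for DistanceFunctions.cached (assumption: List keys instead of Pair). */
    public static <D> DistanceFunction<D> cached(final DistanceFunction<D> delegate) {
        // Anonymous class: a one-off implementation of the interface with its own state.
        return new DistanceFunction<D>() {
            private final Map<List<D>, Double> cache = new HashMap<>();

            @Override
            public double calculate(D data1, D data2) {
                Double distance = cache.get(Arrays.asList(data1, data2));
                if (distance == null) {
                    // First request for this pair: compute once, store both orders.
                    distance = delegate.calculate(data1, data2);
                    cache.put(Arrays.asList(data1, data2), distance);
                    cache.put(Arrays.asList(data2, data1), distance);
                }
                return distance;
            }
        };
    }

    /** Calls the cached function three times and reports how often the delegate ran. */
    public static int countDelegateCalls() {
        AtomicInteger calls = new AtomicInteger();
        DistanceFunction<Integer> func = (a, b) -> {
            calls.incrementAndGet();
            return Math.abs(a - b);
        };
        DistanceFunction<Integer> cache = cached(func);
        cache.calculate(10, 5);  // computed by the delegate
        cache.calculate(10, 5);  // served from the map
        cache.calculate(5, 10);  // served from the map (reversed order)
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countDelegateCalls()); // prints 1
    }
}
```

The delegate runs only for the first call; the repeated call and the reversed call are both answered from the map.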
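If you look further into the DistanceFunctions class you posted, you will see that it already provides some implementations, namely EUCLIDEAN, EUCLIDEAN_INTEGER_LIST and EUCLIDEAN_DOUBLE_LIST. Because they are public static final fields, you can use them directly in your code as constants, without ever creating a DistanceFunctions object. You only have to pass arguments to the calculate(...) method that match the implementation you choose.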
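The following sketch shows why a private constructor is not an obstacle: a static constant is reached through the class name, so no instance is needed. To stay self-contained it does not use MOA; the Distances class is a hypothetical stand-in that mirrors the shape of DistanceFunctions, and StaticConstantDemo and demo are likewise made-up names. With the real library on the classpath, the equivalent expression would be DistanceFunctions.EUCLIDEAN_INTEGER_LIST.

```java
import java.util.Arrays;
import java.util.List;

public class StaticConstantDemo {

    /** Local stand-in for the library interface so the sketch compiles on its own. */
    public interface DistanceFunction<DATA> {
        double calculate(DATA data1, DATA data2);
    }

    /** Hypothetical utility class shaped like DistanceFunctions: constants only. */
    public static final class Distances {
        // Private constructor: the class is a holder for constants and is never instantiated.
        private Distances() {}

        public static final DistanceFunction<List<Integer>> EUCLIDEAN_INTEGER_LIST = (p, q) -> {
            int size = Math.min(p.size(), q.size());
            double sum = 0;
            for (int i = 0; i < size; i++) {
                double diff = p.get(i) - q.get(i);
                sum += diff * diff;
            }
            return Math.sqrt(sum);
        };
    }

    /** Uses the constant through the class name; no "new Distances()" is involved. */
    public static double demo() {
        DistanceFunction<List<Integer>> f = Distances.EUCLIDEAN_INTEGER_LIST;
        return f.calculate(Arrays.asList(0, 0), Arrays.asList(3, 4));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5.0
    }
}
```

The same idea applies when passing the constant as a constructor argument: the argument is the constant itself, not an instance of the class that holds it.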
As for initializing Waikato's MTree, a rough template might look like this:
MTree mTree = new MTree(EUCLIDEAN_INTEGER_LIST, new SplitFunction<List<Integer>>(...) {
    ...
    @Override
    public SplitResult<List<Integer>> process(Set<List<Integer>> dataSet, DistanceFunction<? super List<Integer>> distanceFunction) {
        Pair<List<Integer>> promoted = ...
        Pair<Set<List<Integer>>> partitions = ...
        return new SplitResult<List<Integer>>(promoted, partitions);
    }
});
where the parts shown as ... need to be defined and implemented by you. That said, the code in that package already provides a ComposedSplitFunction implementation, which takes a PartitionFunction and a PromotionFunction as input. Implementations of these are already available in the PartitionFunctions and PromotionFunctions classes, which work the same way as DistanceFunction and DistanceFunctions discussed here.