Is it possible to convert a string timestamp to float or datetime using Java

Posted 2023-03-12


【Title】: Is there any possibility to convert string timestamp to float or datetime using java 【Posted】: 2021-01-08 10:31:07 【Question】:

I am writing Java code that generates random numbers from 1 to 1000 together with timestamps. I produce the timestamp with the following source code:

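import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
// Note: "sss" zero-pads the seconds field to three digits, so the output ends in e.g. ":009".
DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:sss");
Date date = new Date();
String a = dateFormat.format(date);
System.out.println(a);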

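I was able to store the data as a .txt file containing 1000 random numbers and their corresponding timestamps. When I load that .txt file in Python with a pandas DataFrame, the file loads successfully and the DataFrame looks like this: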

    HR   Age    RR  SPo2    Temperature     Timestamp
0   89   70     15  100     36  2020/09/22 12:46:009
1   130  27     15  96      37  2020/09/22 12:46:009
2   93   47     13  100     36  2020/09/22 12:46:009
3   116  53     15  98      36  2020/09/22 12:46:009
4   100  63     14  98      36  2020/09/22 12:46:009

After that, I tried to fit a random forest following a train/test split:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2) 
from sklearn.ensemble import RandomForestClassifier

classifier=RandomForestClassifier(n_estimators=100, criterion='gini', random_state=1, max_depth=3)
classifier.fit(X_train,y_train)

But I get an error:

ValueError                                Traceback (most recent call last)
<ipython-input-52-8f779aefd162> in <module>
     20 #Create a Gaussian Classifier
     21 classifier=RandomForestClassifier(n_estimators=100, criterion='gini', random_state=1, max_depth=3)
---> 22 classifier.fit(X_train,y_train)
     23 
     24 #y_pred=classifier.predict(X_test)

~/anaconda3/lib/python3.7/site-packages/sklearn/ensemble/_forest.py in fit(self, X, y, sample_weight)
    293         """
    294         # Validate or convert input data
--> 295         X = check_array(X, accept_sparse="csc", dtype=DTYPE)
    296         y = check_array(y, accept_sparse='csc', ensure_2d=False, dtype=None)
    297         if sample_weight is not None:

~/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    529                     array = array.astype(dtype, casting="unsafe", copy=False)
    530                 else:
--> 531                     array = np.asarray(array, order=order, dtype=dtype)
    532             except ComplexWarning:
    533                 raise ValueError("Complex data not supported\n"

~/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: could not convert string to float: '2020/09/22 12:46:009'

I am really confused by this. Can anyone help me get past this problem?


【Answer 1】:

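The problem here is that you have to encode the categorical data (the date) as numeric data: the classifier cannot work with your date strings, it needs numbers.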
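As a minimal sketch of turning one of these strings into a number, the snippet below converts the timestamp text into seconds since the Unix epoch using only the Python standard library. The function name `timestamp_to_epoch_seconds` is hypothetical, and treating the timestamps as UTC is an assumption, since the file records no time zone. The three-digit seconds field is split off by hand because `%S` in `strptime` accepts at most two digits.

```python
from datetime import datetime, timezone


def timestamp_to_epoch_seconds(text: str) -> float:
    """Convert 'yyyy/MM/dd HH:mm:sss' text to seconds since the Unix epoch."""
    # Separate the zero-padded seconds field (e.g. '009') from the rest,
    # because strptime's %S directive accepts at most two digits.
    head, seconds = text.rsplit(":", 1)
    moment = datetime.strptime(head, "%Y/%m/%d %H:%M")
    # Assumption: the timestamps are in UTC.
    moment = moment.replace(second=int(seconds), tzinfo=timezone.utc)
    return moment.timestamp()


print(timestamp_to_epoch_seconds("2020/09/22 12:46:009"))
```

A single epoch value like this is a plain float that the classifier can accept, although it carries no notion of time of day or season on its own.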

You can use OneHotEncoder from sklearn to encode all of your dates before passing the data to the classifier.
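One possible way to apply this, sketched under some assumptions: rather than one-hot encoding each full timestamp string (which would give nearly one column per row), the example first extracts month, day, and hour, then one-hot encodes those. The helper `date_parts` and the second and third sample timestamps are hypothetical; only the first string comes from the question.

```python
from datetime import datetime

from sklearn.preprocessing import OneHotEncoder


def date_parts(text: str) -> list:
    """Return [month, day, hour] taken from 'yyyy/MM/dd HH:mm:sss' text."""
    # Drop the three-digit seconds field, which strptime's %S cannot parse.
    head, _seconds = text.rsplit(":", 1)
    moment = datetime.strptime(head, "%Y/%m/%d %H:%M")
    return [moment.month, moment.day, moment.hour]


# Only the first string appears in the question; the others are made up.
stamps = ["2020/09/22 12:46:009", "2020/09/22 13:05:041", "2020/10/01 12:00:000"]
parts = [date_parts(s) for s in stamps]

# handle_unknown="ignore" encodes values not seen during fit as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(parts).toarray()

print(encoder.categories_)
print(encoded)
```

Each distinct month, day, and hour becomes its own 0/1 column, and these columns can be joined with the other numeric features before calling `fit`.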

But as mentioned here, it can be useful to preserve the cyclical nature of dates:

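You want to preserve the cyclical nature of your inputs. One approach is to cut the datetime variable into four variables: year, month, day, and hour. Then, decompose each of these variables (except year) into two.

You create a sine and a cosine facet of each of these three variables (i.e. month, day, hour), which retains the fact that hour 24 is closer to hour 0 than to hour 21, and that month 12 is closer to month 1 than to month 10.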
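The quoted idea can be sketched roughly as follows, using only the standard library. The names `cyclical`, `encode_timestamp`, and `gap` are hypothetical, and using a fixed 31-day period for the day of month is a simplifying assumption.

```python
import math
from datetime import datetime


def cyclical(value: float, period: float) -> tuple:
    """Map a value on a cycle of the given period to (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encode_timestamp(text: str) -> list:
    """Turn 'yyyy/MM/dd HH:mm:sss' text into 7 numeric features."""
    # Drop the three-digit seconds field, which strptime's %S cannot parse.
    head, _seconds = text.rsplit(":", 1)
    moment = datetime.strptime(head, "%Y/%m/%d %H:%M")
    features = [float(moment.year)]
    features.extend(cyclical(moment.month - 1, 12))  # months 1..12 -> 0..11
    features.extend(cyclical(moment.day - 1, 31))    # assumption: 31-day period
    features.extend(cyclical(moment.hour, 24))
    return features


def gap(a: tuple, b: tuple) -> float:
    """Euclidean distance between two (sin, cos) pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


print(encode_timestamp("2020/09/22 12:46:009"))
# Hour 24 lands on hour 0, far from hour 21:
print(gap(cyclical(24, 24), cyclical(0, 24)) < gap(cyclical(24, 24), cyclical(21, 24)))
```

The year stays a single number because it does not repeat, while month, day, and hour each become a sine/cosine pair so that the end of a cycle sits next to its start.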

So essentially, you need to think about how to convert the datetime into numbers so that the classifier can use it.
