Deep Learning in 100 Examples | Day 44: Password Cracking
Posted by K同学啊
- 🚩 Author: K同学啊
- 🥇 Featured column: Deep Learning in 100 Examples
- 🔥 Recommended column: Deep Learning for Beginners
- 📚 From the column: Matplotlib Tutorial
- 🧿 Outstanding column: 100 Python Exercises for Beginners
Hi everyone, I'm K同学啊!
Today we will build a deep learning application for password cracking. This article is for study and reference only; please do not use it for any other purpose! The key points are:
- Reading an Excel file and merging its multiple sheets
- Building a single-input, multi-output model
- How the workflow for a single-input multi-output DL program differs from a single-input single-output one
The goal of our program is to predict the secret hash and the secret_salt from the public hash.
🚀 My environment:
- Language: Python 3.6.5
- Editor: Jupyter Notebook
- Deep learning framework: TensorFlow 2.4.1
- Data: reply DL+44 in my official account (K同学啊) to get the data
- Project code: all of it is included below; just copy it in order
If you are new to deep learning, you may first want to read the column I wrote specifically for beginners: Deep Learning for Beginners.
Our code flowchart is shown below:
I. Preparation
1. Importing the Data
Pay attention here to how the Excel file is read and how its multiple sheets are merged.
```python
import tensorflow as tf
import pandas as pd
import numpy as np

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")
    print(gpus)
```
```
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```
```python
# Passing a list to sheet_name reads all four sheets at once;
# pd.read_excel then returns a dict of DataFrames keyed by sheet index
dataframe = pd.read_excel("哈希数据.xls", sheet_name=[0, 1, 2, 3],
                          names=["uid", "公开哈希", "秘密哈希", "secret_salt"])
```
```python
# First collect the sheets into a list, then pass the list to concat
frames = [dataframe[0], dataframe[1], dataframe[2], dataframe[3]]
result = pd.concat(frames)

# Remember to reset the index, or problems will surface later;
# I once spent an entire morning tracking down a bug caused by this.
df = result.reset_index(drop=True)
df
```
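Since `pd.read_excel` with a list of sheet indices returns a dict of DataFrames, the merge and index reset can also be written in one step. A minimal sketch; the small frames below are stand-ins for the real sheets:

```python
import pandas as pd

# Stand-in for the dict that pd.read_excel(..., sheet_name=[0, 1, 2, 3]) returns
sheets = {
    0: pd.DataFrame({"uid": [1, 2], "hash": ["aa", "bb"]}),
    1: pd.DataFrame({"uid": [3, 4], "hash": ["cc", "dd"]}),
}

# Concatenate every sheet and rebuild a clean 0..n-1 index in one call
df = pd.concat(sheets.values(), ignore_index=True)
print(df.index.tolist())  # [0, 1, 2, 3]
```

`ignore_index=True` does the same job as the `reset_index(drop=True)` call above, so the index pitfall never arises.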
2. Checking the Data
Check the lengths of the public hash, the secret hash, and the secret_salt, and whether they vary from row to row.
```python
len(df.iloc[1, 1]), len(df.iloc[1, 2]), len(df.iloc[1, 3])    # (64, 128, 64)
len(df.iloc[2, 1]), len(df.iloc[2, 2]), len(df.iloc[2, 3])    # (64, 128, 64)
len(df.iloc[21, 1]), len(df.iloc[21, 2]), len(df.iloc[21, 3]) # (64, 128, 64)
```
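Rather than spot-checking a few rows, the fixed-length assumption can be verified for every row at once with pandas string methods. A small sketch, assuming string columns with hypothetical English names:

```python
import pandas as pd

df = pd.DataFrame({
    "public_hash": ["a" * 64, "b" * 64, "c" * 64],
    "secret_hash": ["d" * 128, "e" * 128, "f" * 128],
})

# str.len() gives the per-row length; nunique() == 1 means the length never varies
for col in df.columns:
    assert df[col].str.len().nunique() == 1

print(df["public_hash"].str.len().unique())  # [64]
```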
二、构建训练集与测试集
X_ = df.iloc[:,1]
y_1 = df.iloc[:,2]
y_2 = df.iloc[:,3]
将标签数字化
注意到公开哈希、秘密哈希、secret_salt这些都是有10个阿拉伯数字与26个英文字母组成的且每一个字段长度固定,故而这里将每个字段转化成one-hot编码。
```python
number   = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
alphabet = ['a','b','c','d','e','f','g','h','i','j','k','l','m',
            'n','o','p','q','r','s','t','u','v','w','x','y','z']
char_set     = number + alphabet
char_set_len = len(char_set)

y1_name_len = len(y_1[0])
y2_name_len = len(y_2[0])

# Convert a fixed-length string into a one-hot matrix of shape (length, char_set_len)
def text2vec(text, label_name_len):
    vector = np.zeros([label_name_len, char_set_len])
    for i, c in enumerate(text):
        idx = char_set.index(c)
        vector[i][idx] = 1.0
    return vector

y1_list = [text2vec(i, len(y_1[0])) for i in y_1]
y2_list = [text2vec(i, len(y_2[0])) for i in y_2]
X_list  = [text2vec(i, len(X_[0])) for i in X_]
```
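For inspecting predictions later, the inverse of `text2vec` is handy: take the argmax along the character axis and map each index back to its character. A minimal self-contained sketch using the same `char_set` (the `vec2text` helper is my addition, not part of the original code):

```python
import numpy as np

number   = [str(i) for i in range(10)]
alphabet = list("abcdefghijklmnopqrstuvwxyz")
char_set = number + alphabet
char_set_len = len(char_set)

def text2vec(text, label_name_len):
    # One-hot encode each character of a fixed-length string
    vector = np.zeros([label_name_len, char_set_len])
    for i, c in enumerate(text):
        vector[i][char_set.index(c)] = 1.0
    return vector

def vec2text(vector):
    # Invert the one-hot (or softmax) encoding: argmax per position, then look up the char
    indices = np.argmax(vector, axis=1)
    return "".join(char_set[i] for i in indices)

print(vec2text(text2vec("abc123", 6)))  # abc123
```

Because the model's softmax outputs have the same `(length, char_set_len)` shape, `vec2text` works unchanged on predicted probability matrices.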
```python
X_train  = np.array(X_list[:50000])
y1_train = np.array(y1_list[:50000])
y2_train = np.array(y2_list[:50000])

# Add a trailing channel dimension so the data matches the Conv2D input shape
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], X_train.shape[2], 1)
X_train.shape, y1_train.shape, y2_train.shape
```
```
((50000, 64, 36, 1), (50000, 128, 36), (50000, 64, 36))
```
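The slicing above keeps the first 50,000 samples for training; rows beyond that can serve as a held-out test set in exactly the same way. A sketch with toy arrays standing in for the real encoded lists (which have shapes like `(n, 64, 36)`), assuming the dataset has more rows than the split point:

```python
import numpy as np

# Toy stand-ins for the encoded lists
n = 60
X_list = np.random.rand(n, 4, 3)
y_list = np.random.rand(n, 8, 3)

split = 50  # 50000 in the real data
X_train, X_test = np.array(X_list[:split]), np.array(X_list[split:])
y_train, y_test = np.array(y_list[:split]), np.array(y_list[split:])

# Add the trailing channel dimension the Conv2D input expects
X_train = X_train.reshape(*X_train.shape, 1)
X_test  = X_test.reshape(*X_test.shape, 1)
print(X_train.shape, X_test.shape)  # (50, 4, 3, 1) (10, 4, 3, 1)
```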
III. Building the Model
The model built here differs considerably from previous examples. Since the goal is to predict the secret hash and the secret_salt from the public hash, this is a single-input, multi-output model; pay close attention to how it is constructed.
For more background, see the Model section of the article: Deep Learning for Beginners | 4-1: Two Ways to Build a Model.
```python
# Import everything from tensorflow.keras; mixing the standalone keras
# package with tf.keras in one model can cause subtle errors
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

deep_inputs = Input(shape=(X_train.shape[1], X_train.shape[2], 1))

x = layers.Conv2D(32, (3, 3), activation='relu')(deep_inputs)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(128, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dense(2000, activation='relu')(x)

# Model output 1: the secret hash (128 characters)
out1 = layers.Dense(y1_name_len * char_set_len)(x)
out1 = layers.Reshape([y1_name_len, char_set_len])(out1)
out1 = layers.Softmax(name="out1")(out1)

# Model output 2: the secret_salt (64 characters)
out2 = layers.Dense(y2_name_len * char_set_len)(x)
out2 = layers.Reshape([y2_name_len, char_set_len])(out2)
out2 = layers.Softmax(name="out2")(out2)

model = Model(inputs=deep_inputs, outputs=[out1, out2])

# Print the network architecture
model.summary()
```
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_2 (InputLayer)            [(None, 64, 36, 1)]  0
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 62, 34, 32)   320         input_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 31, 17, 32)   0           conv2d_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 29, 15, 64)   18496       max_pooling2d_3[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 14, 7, 64)    0           conv2d_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 12, 5, 128)   73856       max_pooling2d_4[0][0]
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 6, 2, 128)    0           conv2d_5[0][0]
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 1536)         0           max_pooling2d_5[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 2000)         3074000     flatten_1[0][0]
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 4608)         9220608     dense_3[0][0]
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 2304)         4610304     dense_3[0][0]
__________________________________________________________________________________________________
reshape_2 (Reshape)             (None, 128, 36)      0           dense_4[0][0]
__________________________________________________________________________________________________
reshape_3 (Reshape)             (None, 64, 36)       0           dense_5[0][0]
__________________________________________________________________________________________________
out1 (Softmax)                  (None, 128, 36)      0           reshape_2[0][0]
__________________________________________________________________________________________________
out2 (Softmax)                  (None, 64, 36)       0           reshape_3[0][0]
==================================================================================================
Total params: 16,997,584
Trainable params: 16,997,584
Non-trainable params: 0
__________________________________________________________________________________________________
```
IV. Compiling the Model
For how to choose a loss function, see the article: Deep Learning for Beginners | 3-4: Loss Functions.
```python
optimizer = tf.keras.optimizers.Adam(1e-4)

# A single loss/metric here is applied to both outputs
model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
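Under the hood, `categorical_crossentropy` is applied independently at each character position: the loss is the negative log-probability the softmax assigns to the true character, averaged over positions. A small numpy sketch of that computation (an illustration, not the Keras implementation itself):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # y_true: one-hot matrix (positions, chars); y_pred: softmax probabilities, same shape.
    # Mean over positions of -log p(true char); eps guards against log(0).
    return float(np.mean(-np.sum(y_true * np.log(y_pred + eps), axis=-1)))

# Two character positions over a 3-symbol alphabet
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(round(categorical_crossentropy(y_true, y_pred), 4))  # 0.2899
```

Keras also accepts per-output dicts keyed by the output layers' names, e.g. `loss={"out1": ..., "out2": ...}` and `loss_weights={...}`, which is useful when the two outputs should not be weighted equally.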
V. Training the Model
```python
epochs = 200

history = model.fit(
    X_train,
    (y1_train, y2_train),
    # Note: validation here reuses the training data, so the val_* metrics
    # do not measure generalization
    validation_data=(X_train, (y1_train, y2_train)),
    epochs=epochs
)
```
```
Epoch 1/200
1563/1563 [==============================] - 15s 8ms/step - loss: 5.5879 - softmax_loss: 2.7939 - softmax_1_loss: 2.7941 - softmax_accuracy: 0.0624 - softmax_1_accuracy: 0.0626 - val_loss: 5.5553 - val_softmax_loss: 2.7776 - val_softmax_1_loss: 2.7777 - val_softmax_accuracy: 0.0632 - val_softmax_1_accuracy: 0.0630
Epoch 2/200
1563/1563 [==============================] - 9s 6ms/step - loss: 5.5535 - softmax_loss: 2.7767 - softmax_1_loss: 2.7768 - softmax_accuracy: 0.0626 - softmax_1_accuracy: 0.0627 - val_loss: 5.5507 - val_softmax_loss: 2.7755 - val_softmax_1_loss: 2.7752 - val_softmax_accuracy: 0.0632 - val_softmax_1_accuracy: 0.0635
Epoch 3/200
1563/1563 [==============================] - 9s 6ms/step - loss: 5.5498 - softmax_loss: 2.7750 - softmax_1_loss: 2.7748 - softmax_accuracy: 0.0626 - softmax_1_accuracy: 0.0633 - val_loss: 5.5476 - val_softmax_loss: 2.7739 - val_softmax_1_loss: 2.7738 - val_softmax_accuracy: 0.0638 - val_softmax_1_accuracy: 0.0643
......
Epoch 199/200
1563/1563 [==============================] - 9s 6ms/step - loss: 3.6113 - softmax_loss: 2.3187 - softmax_1_loss: 1.2925 - softmax_accuracy: 0.2610 - softmax_1_accuracy: 0.5860 - val_loss: 3.4486 - val_softmax_loss: 2.2581 - val_softmax_1_loss: 1.1905 - val_softmax_accuracy: 0.2817 - val_softmax_1_accuracy: 0.6332
Epoch 200/200
1563/1563 [==============================] - 9s 6ms/step - loss: 3.6094 - softmax_loss: 2.3188 - softmax_1_loss: 1.2906 - softmax_accuracy: 0.2608 - softmax_1_accuracy: 0.5865 - val_loss: 3.4469 - val_softmax_loss: 2.2583 - val_softmax_1_loss: 1.1886 - val_softmax_accuracy: 0.2815 - val_softmax_1_accuracy: 0.6342
```
The point of this article is to walk through the password-cracking workflow end to end. If you are interested in this area, feel free to dig deeper on your own; it would not be appropriate for me to publish a highly accurate model here.