CSNN: Password Set Security Evaluation Method Based on Chinese Syllables and Neural Network

Hequn XIAN; Yi ZHANG; Ding WANG; Zengpeng LI; Yunlong HE

doi: 10.11999/JEIT190856 cstr: 32379.14.JEIT190856

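Hequn XIAN^{1, 2},
Yi ZHANG^{1, 2},
Ding WANG³,
Zengpeng LI¹,
Yunlong HE¹

1. College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
2. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
3. College of Cyber Science, Nankai University, Tianjin 300350, China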

Funds: The National Natural Science Foundation of China (61802214); The Shandong Provincial Natural Science Foundation (ZR2019MF058)

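Author biographies:
Hequn XIAN: male, born in 1979, Ph.D., associate professor. His main research interests include cloud computing security, big data security, blockchain security, and database security.

Yi ZHANG: female, born in 1995, master's degree. Her research interests are cloud computing security and cryptography.

Ding WANG: male, born in 1985, Ph.D., professor. His main research interests include password security, cryptographic protocols, and provable security.

Zengpeng LI: male, born in 1989, Ph.D., assistant professor. His main research interests are public-key cryptography, cryptographic protocols, and distributed secure computation.

Yunlong HE: male, born in 1999, bachelor's degree. His research interests are cloud computing security and cryptography.

Corresponding author:
Hequn XIAN, xianhq@126.com

CLC number: TP309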
Publication history
- Received: 2019-11-01
- Revised: 2020-02-25
- Published online: 2020-04-09
- Issue date: 2020-08-18


Abstract

Abstract: Password guessing is one of the most direct attacks for gaining access to an information system, and a password dictionary generated by an appropriate method can accurately evaluate the security of a system's password set. This paper proposes CSNN (Chinese Syllables and Neural Network-based password generation), a password dictionary generation method for Chinese password sets. CSNN treats each complete Chinese syllable (Pinyin) as an integral element and then uses Pinyin spelling rules to parse passwords into structures and process them. The processed passwords are used to train a Long Short-Term Memory (LSTM) network, and the trained model generates password dictionaries (guessing sets). The effectiveness of CSNN is evaluated through hit-rate experiments that compare it with two classical password generation methods, Probabilistic Context-Free Grammar (PCFG) and the 5th-order Markov chain model, using dictionaries of different scales. The results show that the dictionaries generated by CSNN perform better overall than those of the other two methods. Compared with PCFG, at 10⁷ guesses the hit rate of the CSNN dictionary on the different test sets is higher by 5.1% to 7.4% (6.3% on average); compared with the 5th-order Markov chain model, at 8×10⁵ guesses it is higher by 2.8% to 12% (8.2% on average).
- Password set security evaluation /
- Password dictionary generation /
- Neural Networks (NN) /
- Identity authentication
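
The hit rate used in these comparisons can be illustrated with a short sketch. It assumes a common definition: the share of test-set passwords (counted with repetitions) that appear among the first N distinct guesses of a dictionary. The exact counting rule of the paper is not stated in this section, and the function name `hit_rate` and the sample passwords are hypothetical.

```python
from collections import Counter
from typing import Iterable


def hit_rate(guesses: Iterable[str], test_passwords: Iterable[str],
             max_guesses: int) -> float:
    """Share of test passwords covered by the first `max_guesses` distinct guesses.

    Assumption: test passwords are counted with multiplicity, and a repeated
    guess does not consume extra guessing budget.
    """
    test_counts = Counter(test_passwords)
    total = sum(test_counts.values())
    if total == 0:
        return 0.0

    seen = set()
    hits = 0
    for guess in guesses:
        if len(seen) >= max_guesses:
            break
        if guess in seen:
            continue  # skip duplicates inside the dictionary
        seen.add(guess)
        hits += test_counts.get(guess, 0)
    return hits / total


# Hypothetical toy data, for illustration only.
toy_dictionary = ["123456", "woaini", "123456", "password"]
toy_test_set = ["123456", "123456", "woaini1314", "woaini"]
toy_result = hit_rate(toy_dictionary, toy_test_set, max_guesses=2)
```

Evaluating the same dictionary at several values of `max_guesses` gives the kind of hit-rate curve that is compared across methods.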

Figure 1 Example of the PCFG process

Figure 2 Implementation of the CSNN method

Figure 3 Hit rate results

Figure 4 Hit rates of different password generation methods on different password sets

Table 1 Structure Parsing algorithm

input: Training Set, allCSs
intermediate result: the structure of the current password (thisStructure)
output: password structure frequency table (Structure)
for password $\in$ Training Set do
  if Array_alphaStrings ← match_alphaStrings(password) then
    for alphaString $\in$ Array_alphaStrings do
      i, e ← index(alphaString), end(alphaString)
      if CSs ← match_CSs(alphaString) then
        Array_Ci, Array_Ce ← index(CSs), end(CSs)
        Queue_append(thisStructure, 'C', Array_Ci)
        Array_Li ← getsubStringIndex(i, e, Array_Ci, Array_Ce)
        Queue_append(thisStructure, 'L', Array_Li)
      else
        Queue_append(thisStructure, 'L', i)
      end if
    end for
  end if
  if Array_digitStrings ← match_digitStrings(password) then
    Array_Di ← index(Array_digitStrings)
    Queue_append(thisStructure, 'D', Array_Di)
  end if
  if Array_specialStrings ← match_specialStrings(password) then
    Array_Si ← index(Array_specialStrings)
    Queue_append(thisStructure, 'S', Array_Si)
  end if
  Structure.add(thisStructure)
end for
Structure.frequency()
return Structure
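
The structure parsing idea in Table 1 can be sketched in Python as follows. This is an illustrative approximation, not the authors' implementation: it scans each password from left to right instead of queueing segments by index, it splits letter runs with a greedy longest-match rule, and it collapses adjacent segments of the same class into a single tag. The syllable set `SYLLABLES` is a tiny hypothetical sample taken from Table 4 rather than the full `allCSs` list, and all function names are assumptions.

```python
import re
from collections import Counter
from itertools import groupby
from typing import Iterable, List, Tuple

# Hypothetical, tiny syllable inventory; a real run would load every Pinyin syllable.
SYLLABLES = {"wo", "li", "ai", "wang", "yu", "ni", "xiao", "zhang",
             "wei", "liu", "ji", "yang", "xi", "chen", "wu", "hu", "ma"}
MAX_LEN = max(len(s) for s in SYLLABLES)
# A = letter run, D = digit run, S = run of any other characters.
RUNS = re.compile(r"(?P<A>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def split_letters(run: str) -> List[Tuple[str, str]]:
    """Greedy longest-match split of a letter run into 'C' (syllable) and 'L' segments."""
    segments: List[Tuple[str, str]] = []
    lower = run.lower()
    pos = 0
    while pos < len(run):
        for size in range(min(MAX_LEN, len(run) - pos), 0, -1):
            if lower[pos:pos + size] in SYLLABLES:
                segments.append(("C", run[pos:pos + size]))
                pos += size
                break
        else:
            # No syllable starts here: grow the current plain-letter segment.
            if segments and segments[-1][0] == "L":
                segments[-1] = ("L", segments[-1][1] + run[pos])
            else:
                segments.append(("L", run[pos]))
            pos += 1
    return segments


def parse(password: str) -> List[Tuple[str, str]]:
    """Split a password into (tag, text) segments with tags C, L, D, S."""
    segments: List[Tuple[str, str]] = []
    for match in RUNS.finditer(password):
        if match.lastgroup == "A":
            segments.extend(split_letters(match.group()))
        else:
            segments.append((match.lastgroup, match.group()))
    return segments


def structure(password: str) -> str:
    """Structure string such as 'CD'; adjacent identical tags are merged (assumption)."""
    return "".join(tag for tag, _ in groupby(tag for tag, _ in parse(password)))


def structure_frequencies(training_set: Iterable[str]) -> Counter:
    """Count how often each structure occurs in the training set."""
    return Counter(structure(p) for p in training_set)
```

With this sample syllable set, `structure("woaini1314")` evaluates to `"CD"` and `structure("abc123")` to `"LD"`. Greedy matching can mis-segment ambiguous letter runs, so a fuller implementation would need the disambiguation rules of the paper.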

Table 2 Password Generation algorithm

input: $\Sigma$, M
output: Password dictionary
count ← 0
while count < scale do
  nowStr ← getStr_rand($\Sigma$)
  nowStr ← strCat(nowStr, EOF)
  incoPwd ← STA
  for seg $\in$ nowStr do
    if seg $\in$ predict(M, incoPwd) then
      prediction ← selectSeg_rand(M, seg)
      tempPwd ← pwdCat(incoPwd, prediction)
      if len(printable(tempPwd)) <= Len and weight(printable(tempPwd)) >= T then
        incoPwd ← tempPwd
      else
        incoPwd ← NULL
        break
      end if
    else
      incoPwd ← NULL
      break
    end if
  end for
  if end(incoPwd) == EOF then
    dictionary.add(printable(incoPwd))
    ++count
  end if
end while
return dictionary
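
The generation loop in Table 2 can be sketched as below under several loudly stated assumptions. A hand-written lookup table `TOY_MODEL` stands in for the trained LSTM M and conditions only on the last token; `weight` is interpreted as the running probability of the partial password; generated passwords are deduplicated; and a `max_attempts` cap is added so the loop always terminates. All tokens, probabilities, structures, and names are invented for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

START, END = "<STA>", "<EOF>"

# Toy stand-in for the trained model M (assumption): last token -> {next token: (tag, prob)}.
TOY_MODEL: Dict[str, Dict[str, Tuple[str, float]]] = {
    START: {"wo": ("C", 0.5), "abc": ("L", 0.3), "123": ("D", 0.2)},
    "wo": {"ai": ("C", 0.6), "123": ("D", 0.3), END: (END, 0.1)},
    "ai": {"ni": ("C", 0.7), "520": ("D", 0.2), END: (END, 0.1)},
    "ni": {"1314": ("D", 0.6), END: (END, 0.4)},
    "abc": {"123": ("D", 0.7), END: (END, 0.3)},
    "123": {END: (END, 1.0)},
    "520": {END: (END, 1.0)},
    "1314": {END: (END, 1.0)},
}

# Toy structure table (assumption): one tag per token, with sampling weights.
TOY_STRUCTURES: Dict[Tuple[str, ...], float] = {
    ("C", "D"): 0.5,
    ("L", "D"): 0.3,
    ("C", "C", "C", "D"): 0.2,
}


def generate(structures, model, scale: int, max_len: int = 16,
             threshold: float = 0.0, rng: Optional[random.Random] = None,
             max_attempts: int = 10_000) -> List[str]:
    """Sample up to `scale` distinct passwords that follow randomly chosen structures."""
    rng = rng or random.Random()
    shapes, weights = list(structures), list(structures.values())
    dictionary: List[str] = []
    seen = set()
    for _ in range(max_attempts):
        if len(dictionary) >= scale:
            break
        shape = rng.choices(shapes, weights=weights)[0] + (END,)
        tokens, prob = [START], 1.0
        for tag in shape:
            # Keep only predictions whose class matches the structure tag.
            options = {tok: p for tok, (t, p) in model.get(tokens[-1], {}).items()
                       if t == tag}
            if not options:
                tokens = []
                break
            choice = rng.choices(list(options), weights=list(options.values()))[0]
            new_prob = prob * options[choice]
            length = sum(len(t) for t in tokens[1:])
            length += 0 if choice == END else len(choice)
            if length > max_len or new_prob < threshold:
                tokens = []  # reject: too long or too unlikely
                break
            tokens.append(choice)
            prob = new_prob
        if tokens and tokens[-1] == END:
            password = "".join(tokens[1:-1])
            if password not in seen:
                seen.add(password)
                dictionary.append(password)
    return dictionary
```

A call such as `generate(TOY_STRUCTURES, TOY_MODEL, scale=3, rng=random.Random(0))` draws from the three passwords this toy model can produce; raising `threshold` or lowering `max_len` filters out the long, low-probability candidates.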

Table 3 Password sets used in this paper (the last four columns give the number of passwords, with the percentage of used passwords in parentheses)

Password set	Service type	Original count	Used count	Contain letter strings	Contain Pinyin	Contain two or more adjacent Pinyin syllables	Consist only of Pinyin
Dodonew	E-commerce	16,258,260	12,494,033	8,856,456(70.9%)	3,606,968(28.9%)	1,079,000(8.6%)	1,752,575(14.0%)
CSDN	IT forum	6,428,277	6,370,893	3,619,077(56.8%)	2,046,963(32.1%)	583,968(9.2%)	550,444(8.6%)
12306	Railway ticketing	129,303	129,303	95,373(73.8%)	39,544(30.6%)	10,861(8.4%)	17,146(13.2%)
NetEase Mail	Email	1,220,088,121	20,630,312	11,532,344(55.9%)	5,279,116(25.6%)	1,830,575(8.9%)	2,018,686(10.6%)

Table 4 The 18 most popular Chinese syllables (Pinyin) in each password set

Password set	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18
NetEase Mail	wo	li	ai	wang	yu	ni	ng	xiao	zhang	wei	liu	ji	yang	xi	chen	wu	hu	ma
Dodonew	wo	li	ai	ni	yu	wang	liu	xiao	zhang	wei	ng	ji	xu	chen	yang	hu	wu	xi
12306	wo	li	ai	ni	wang	yu	wei	xiao	liu	ji	zhang	ma	ng	chen	shi	an	yang	wu
CSDN	li	wo	de	yu	wang	ng	ji	liu	zhang	xiao	ai	wei	ma	xi	an	ni	chen	hu

Table 5 Distribution of password structure frequencies (%)

Rank	NetEase Mail		Dodonew		12306		CSDN
Rank	Structure	Frequency	Structure	Frequency	Structure	Frequency	Structure	Frequency
1	D	43.5	LD	31.8	LD	30.1	D	42.7
2	LD	22.7	D	29.0	D	27.2	LD	14.8
3	CD	6.4	CD	11.2	CD	10.4	CD	5.6
4	LCD	4.9	DL	7.6	DL	9.3	LCD	5.3
5	DL	4.4	LCD	6.4	LCD	6.9	LC	4.5
6	LC	3.9	LC	2.3	CLD	2.1	DL	4.3
7	C	1.5	CLD	1.4	LC	2.1	LCL	2.7
8	DC	1.1	DC	1.2	LCLD	1.7	L	1.8
9	LCL	0.9	LCLD	1.1	DC	1.2	CLD	1.7
10	CLD	0.9	C	1.0	LDL	1.1	LCLD	1.7

