I spent nearly an afternoon reading the TensorFlow Recurrent Neural Network tutorial and working through its PTB implementation, and found it fairly hard to follow. So, borrowing parts of that code, I wrote a simplified Language Model of my own, with the overall approach inspired by Keras's LSTM text generation example.
Code: Github
When reposting, please credit the source: Gaussic
Language Model
The key idea of a language model is this: given the preceding words, infer the most likely next word. For example, having seen "The fat cat sat on the", we can infer that the next word is most likely "mat".
Consider the phrase "The fat cat". By the chain rule of conditional probability, its joint probability decomposes as:
- p(The,fat,cat)=p(The)·p(fat|The)·p(cat|The,fat)
Given that the preceding words are "The fat", the probability that the next word is "cat" can then be derived:
- p(cat|The,fat)=p(The,fat,cat)/p(The,fat)
The numerator is the number of times "The fat cat" appears in the corpus, and the denominator is the number of times "The fat" appears (a small counting example is given after the formulas below). By the same reasoning, the probability of a whole sentence such as "The fat cat sat on the mat" can be built up word by word. In the simplest case, each word is assumed to depend only on the single preceding word; this is the bigram model:
- p(S)=p(w1,w2,···,wn)=p(w1)·p(w2|w1)·p(w3|w2)·p(w4|w3)···p(wn|wn−1)
Going one step further, the next word may depend on the two preceding words, which is called the trigram model:
- p(S)=p(w1,w2,···,wn)=p(w1)·p(w2|w1)·p(w3|w1,w2)·p(w4|w2,w3)···p(wn|wn−2,wn−1)
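As a concrete illustration of the count-based estimate (this toy example is my own, not code from the original post), the conditional probability p(cat|The,fat) can be computed directly from trigram and bigram counts:

```python
from collections import Counter

# Toy corpus; in practice the counts would come from the full training corpus (e.g. PTB).
corpus = "the fat cat sat on the mat the fat cat ran away the fat dog sat".split()

# Count every bigram and trigram occurring in the corpus.
bigrams = Counter(zip(corpus, corpus[1:]))
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))

# p(cat | the, fat) ≈ count(the, fat, cat) / count(the, fat)
p_cat = trigrams[("the", "fat", "cat")] / bigrams[("the", "fat")]
print(p_cat)  # 2/3 for this toy corpus
```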
Read the data from the file, replace each newline character with <eos>, and then convert the text into a list of words (a code sketch of this step follows the sample lines below). The raw corpus looks like this:
- we 're talking about years ago before anyone heard of asbestos having any questionable properties
- there is no asbestos in our products now
- neither <unk> nor the researchers who studied the workers were aware of any research on smokers of the kent cigarettes
- we have no useful information on whether users are at risk said james a. <unk> of boston 's <unk> cancer institute
- the total of N deaths from malignant <unk> lung cancer and <unk> was far higher than expected the researchers said
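The full preprocessing code lives in the repository linked above; as a rough sketch of the step just described (the file path and helper names here are my own assumptions), it might look like this:

```python
import collections

def read_words(filename):
    """Read the raw text and mark the end of each sentence with <eos>."""
    with open(filename, "r", encoding="utf-8") as f:
        # Surrounding spaces keep <eos> as a separate token after split().
        return f.read().replace("\n", " <eos> ").split()

def build_vocab(words):
    """Map each word to an integer id, most frequent words first."""
    counter = collections.Counter(words)
    # Sort by descending frequency, breaking ties alphabetically.
    pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    return {word: i for i, (word, _) in enumerate(pairs)}

# "data/ptb.train.txt" is a hypothetical path to the PTB training file.
words = read_words("data/ptb.train.txt")
word_to_id = build_vocab(words)
data = [word_to_id[w] for w in words]  # the whole corpus as a list of word ids
```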