Published on 2023-02-13
CS224N Lecture 10: Transformers and Pretraining
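Subword Modeling So far, we have trained our models under the basic assumption of a finite vocabulary. When we encounter a word that does not appear in the "dictionary", we …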
Issues with RNN models Linear interaction distance: words that sh …
To improve efficiency, the model does not process sentences (embeddings) one at a time; it processes them in batches. But a batch …
For languages for which not much parallel data is available, comm …
Training an RNN Language Model At each step, the model has the pr …
Context-Free Grammars (CFGs) Also called constituency or phrases …
This chapter mainly covers computing derivatives and the concepts and steps of the backpropagation algorithm. Remember: Stochastic Gradi …
Review: Main idea of word2vec Start with random word vectors Iter …
Assignment 1 Word Vectors Part 1 Count-Based Word Vectors Constru …