Deep Learning Vocab EN vs CN 深度学习词汇英中对照

(深度碎片) #1

(This is a wiki post - please edit it to add your own translations)

本页面为fastai v3 2019中文版的一部分
This page is part of the Chinese edition of fastai v3 2019.


Bilingual subtitle 双语字幕

我致力于为fastai视频提供英中双语字幕(已翻译的2019 part1 的3-5课目前只有中文字幕,后续版本都会补配上同步英文字幕,感谢 @Junmin @LiuYanliang 的建议)。
I am committed to providing bilingual (English and Chinese) subtitles for the fastai lesson videos. So far, the subtitles for lessons 3-5 of 2019 part 1 are only in Chinese; I will add synchronized English subtitles to them later, and all subsequent subtitles will be bilingual. (Thanks to @Junmin and @LiuYanliang for the bilingual subtitle suggestion.)

为了方便大家更好理解,在英中字幕里,我会通过 ( )中的内容来补齐 @jeremy 讲演中省略掉的但有助于理解的信息。虽然初衷是帮助大家更好理解,但这么做可能植入我个人的不准确或误导性的理解,所以强烈欢迎大家多多指正和质疑;当然,大家可以直接忽略( )中的内容。
To make the videos easier to understand, in both the English and Chinese subtitles I add information inside ( ) based on my understanding of what @jeremy is saying, since spoken sentences often leave out information that is implied or was stated earlier. However, this risks introducing my own bias and misunderstanding. Please help correct any such bias or misunderstanding in the subtitles; you can also simply ignore everything inside ( ).

强烈欢迎大家帮助指正字幕中的任何问题,也欢迎质疑和提问!对于大家的纠错与改进的贡献,我都会在下面这个清单中通过 @你 的方式给予credit!
Please help me correct my bias and misunderstanding in the subtitles in reply to this post! Your contribution will be credited in the following list! :heartpulse:

哪一课的字幕 | 时间点链接 | @贡献人 | 更正内容
Lesson Subtitle EN/CN | timeline link | @contributor | Content Correction

字幕问题 questions on subtitles :rescue_worker_helmet:


Fastai EN-CN Vocab 英中词汇表

我计划在翻译课程字幕过程中,和大家一起逐步养成一个fastai 深度学习机器学习中英词汇对照表
While translating the course subtitles, I plan to build, together with everyone, an English-Chinese vocabulary list for fastai deep learning and machine learning.

欢迎任何相关贡献,所有贡献者理应得到认可和@!现在这篇文章已经被维基化了,所以大家贡献时可以@自己
Any contribution is welcome and since this post is a wiki now, you are free to @yourself with credit when you contribute. :heartpulse:

如何贡献 How to contribute
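  • 帮助翻译有疑难点标注 :rescue_worker_helmet: 的词汇 Help translate the terms marked :rescue_worker_helmet: as difficult
  • 指出已有词汇翻译中的不足并纠正 Point out and correct weaknesses in the existing translations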
  • 指出英文字幕翻译中的

The needed help I can think of at the moment

  • collections of vocab from key DL/ML books and courses are welcome
  • recommendation of existing high quality resources on DL/ML vocab in EN and CN
  • help us with the missing or incorrect translation
  • list the English vocab you want to know the Chinese counterpart
  • list the Chinese vocab you want to know the English counterpart

如何快速在网上查找中文对应词 How I find key vocab translation online
  1. 书内搜索 Search within an open source book
    邱锡鹏深度学习教科书开源 Qiu Xipeng's open-source deep learning textbook
  2. 网站搜索 Search on google
    Google: English term 中文/中文数学
  3. 在线单词表 online vocab list
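    机器之心的深度学习词汇,但较长时间未更新 Jiqizhixin (机器之心) deep learning vocab list, not updated for a long time
    谷歌开发者机器学习词汇表 Google Developers machine learning glossary, EN version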

fastai specific vocab 专有词汇 英中对照

:white_check_mark: = 官方认可 approved by fastai

crappify (翻译建议采集中 translation options: 1. 垃圾化; 2. 残次化)
DataBunch 数据堆 :white_check_mark:
discriminative learning rate (翻译建议采集中 translation options: 1. 判别学习率; 2. 区别学习率)


fastai lesson video vocab 课程字幕词汇 英中对照

:rescue_worker_helmet: = 需要救助 need help

第六课 Lesson 6

weight tying 权重拴连 :rescue_worker_helmet:
channel 通道
convolution kernel 卷积核
reflection mode 反射模式
padding mode 填充模式
dihedral 二面角
weight norm 权重归一
Covariate Shift 协变量偏移
Batch Normalization BN 批量归一化
instance 实例 (instantiate 实例化)
module 模块
dropout mask 随机失活的掩码
Bernoulli trial 伯努利试验
test time / inference time 测试期/预测期
training time 训练期
spiking neurons 脉冲神经元
tabular learner 表格学习器
long tail distributions 长尾分布
Root Mean Squared Percentage Error 均方根百分比误差
nomenclature 名称系统
cardinality 集合元素数量
preprocessors 预处理
RMSPE (root mean squared percentage error) 均方根百分比误差
computer vision 计算机视觉
projection 投射

第五课 Lesson 5 vocab

MAPE mean absolute percentage error 平均绝对百分比误差 感谢 @thousfeet
super-convergence 超级收敛
dynamic learning rate 动态学习率
exponentially weighted moving average 指数加权移动平均值
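
MAPE above and RMSPE (listed under Lesson 6) are both percentage errors, so a small worked example may help keep the two terms apart. This is only a sketch using the standard textbook definitions; the function names `mape` and `rmspe` and the sample numbers are made up for illustration, and the code assumes no actual value is zero.

```python
import math


def mape(actuals, preds):
    """MAPE 平均绝对百分比误差: mean of |actual - prediction| / |actual|."""
    return sum(abs((a - p) / a) for a, p in zip(actuals, preds)) / len(actuals)


def rmspe(actuals, preds):
    """RMSPE 均方根百分比误差: square root of the mean squared percentage error."""
    squared = [((a - p) / a) ** 2 for a, p in zip(actuals, preds)]
    return math.sqrt(sum(squared) / len(actuals))


# Hypothetical data: each prediction is off by 10% of its actual value.
actuals = [100.0, 200.0]
preds = [110.0, 180.0]

mape_value = mape(actuals, preds)
rmspe_value = rmspe(actuals, preds)
print(mape_value, rmspe_value)  # both are approximately 0.1
```

Because every prediction here is off by the same percentage, the two metrics agree; RMSPE grows faster than MAPE when a few predictions have much larger percentage errors than the rest.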

更多词汇 more vocab

epoch 轮次 (完整遍历一次训练集;不同于 iteration 迭代次数)
finite differencing 有限差分法
analytic solution 解析解
convergence 收敛
divergence 发散 (与 convergence 收敛 相对;KL divergence 等场合译为 散度)
L2 regularization L2正则化
learning rate annealing 学习率退火
element-wise function 元素逐一函数, 感谢与 @Moody 的探讨
logistic regression model 逻辑回归模型
flatten 整平 (numpy.arrays)
actuals 目标真实值
constructor 构造函数
generalization 泛化
2nd degree polynomial 2次多项式
Gradient Boosted Trees 梯度提升树
Entity Embeddings 实体嵌入
NLL (negative log likelihood) 负对数似然
PCA (Principal Component Analysis) 主成分分析
weight decay 权值衰减
benchmark 基准
cross-validation 交叉验证
latent factors 潜在因子
array lookup 数组查找
one-hot encoding one-hot编码,或者一位有效编码,(或者 独热编码 感谢 @LiuYanliang )
Dimensions 维度
transpose 转置
Convolutions 卷积
Affine functions 仿射函数
Batch Normalization 批量归一化
multiplicatively equal 乘数分布相同 (每层都10倍递增/减 1e-5, 1e-4, 1e-3)
diagonal edges 对角线边角
filter 过滤器
target 目标值
softmax softmax函数 (转化成概率值的激活函数)
backpropagation 反向传播
Universal Approximation Theorem 通用近似定理
weight tensors 参数张量
input activations 输入激活值
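
The entry multiplicatively equal above describes values spaced by a constant factor (1e-5, 1e-4, 1e-3) rather than a constant difference. A minimal sketch of that spacing follows; `multiplicative_steps` is a hypothetical helper written for this example, not a fastai function.

```python
def multiplicative_steps(start, stop, n):
    """Return n values from start to stop, each a constant multiple of the previous one."""
    if n < 2:
        raise ValueError("n must be at least 2")
    ratio = (stop / start) ** (1 / (n - 1))
    return [start * ratio ** i for i in range(n)]


# Three values between 1e-5 and 1e-3: each step multiplies by about 10.
lrs = multiplicative_steps(1e-5, 1e-3, 3)
print(lrs)  # approximately [1e-05, 1e-04, 1e-03]
```

An additively equal spacing over the same range would give roughly 1e-5, 5.05e-4, 1e-3, which is why the distinction matters when values span several orders of magnitude.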

第四课 Lesson 4

mask 掩码
matrix multiplication 矩阵乘法
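dot product vs matrix product 点积 vs 矩阵乘法 (单个向量和单个向量的乘法 = 点积,矩阵(多个向量)与矩阵(多个向量)的乘法 = 矩阵乘法;a single vector times a single vector = dot product, a matrix (several vectors) times a matrix = matrix product)。当前字幕版本对这两个词混淆使用了(全用了"点积"这个词),下个版本会做修正。The current subtitle version mixes up the two terms (it uses 点积 for both); the next version will fix this.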
unfreeze 解冻模型
freeze 封冻模型
cross-entropy 交叉熵
scaled sigmoid 被放缩的S函数
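
To make the dot product vs matrix product entry above concrete, here is a small plain-Python sketch. The helper names `dot` and `matmul` are chosen for this example only; in practice a tensor library would do this work.

```python
def dot(u, v):
    """Dot product 点积: two vectors of equal length give a single number."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Matrix product 矩阵乘法: each row of A is dotted with each column of B."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must match rows of B")
    columns_of_B = list(zip(*B))
    return [[dot(row, col) for col in columns_of_B] for row in A]


u = [1, 2, 3]
v = [4, 5, 6]
dot_result = dot(u, v)  # 1*4 + 2*5 + 3*6 = 32

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
mat_result = matmul(A, B)  # [[19, 22], [43, 50]]
```

The matrix product is built entirely out of dot products, which is likely why the two terms are easy to mix up in translation.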

更多词汇 more vocab

layers 层
activations 激活值/层
parameters 参数
Rectified Linear Unit, ReLU 线性整流函数,(或者修正线性激活函数 感谢 @LiuYanliang )
non-linear activation functions 非线性激活函数
nonlinearities 非线性激活函数
sigmoid S函数
bias vector 偏差向量 (或者 偏置向量 感谢 @LiuYanliang )
embedding matrix 嵌入矩阵
bias 偏差
dropout 随机失活 (感谢 @Junmin ),或者 丢弃法
root mean squared error(RMSE)均方根误差
mean squared error(MSE)均方误差
sum squared error 残差平方和
vector 向量
spreadsheet 电子表格
dot product 点积
state of the art 最先进的
time series 时间序列
cold start problem 冷启动问题
timestamp 时间戳
sparse matrix 稀疏矩阵
collaborative filtering 协同过滤
metrics 度量函数/评估工具
end-to-end training 端到端训练
fully connected layer 全连接层
meta data 元数据
tabular learner 表格数据学习器
dependent variable 因变量 (或者 应变量)
data augmentation 数据增强
processes 预处理
transforms 变形处理/设置
categorical variable 类别变量
continuous variable 连续变量
feature engineering 特征工程
gradient boosting machines 梯度提升器
hyperparameters 超参数
random forest 随机森林
discriminative learning rate 判别学习率
tabular data 表格数据
momentum 动量
decoder 解码器
encoder 编码器
accuracy 准确率
convergence 收敛
overfitting 过拟合
underfitting 欠拟合
inputs 输入值
weight matrix 权重矩阵
matrix multiply 矩阵相乘
Tokenization 分词化
Numericalization 数值化
Learner 学习器 (感谢 @stas 学习器与模型的内涵对比)
target corpus 目标文本数据集
Supervised Learning/models 监督学习/模型
Self-Supervised Learning 自监督学习
pretrained model 预训练模型
fine-tuning 微调

第三课 Lesson 3

independent variable 自变量 (感谢 @Junmin 的指正)

Image Classification 图片分类
Image Segmentation 图片分割
Image Regression 图片回归
CNN Convolutional Neural Network 卷积神经网络
RNN Recurrent Neural Network 循环神经网络
NLP Natural Language Processing 自然语言处理
language model 语言模型

课程其他词汇 vocab not DL specific

take it with a slight grain of salt 不可全信
come out of left field 不常见的
elapsed time 所经历的时间 :rescue_worker_helmet: 出现时间点
connoisseur 鉴赏级别/专业类电影
nomenclature 专业术语
rule of thumb 经验法则
asymptote 渐近线
delimiter 分隔符
enter 回车键
macro 宏
unwieldy 困难
infuriating 特别烦人
hone in on it 精确定位目标
hand waving 用手做比划/解释
string 字符串
list 列表


机器学习基础词汇 ML vocab from elsewhere


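以下是我之前从周志华西瓜书收集的一小部分词汇对照。
Below is a small set of vocab pairs I collected earlier from Zhou Zhihua's "watermelon book" (西瓜书).
期待有小伙伴能将周志华的西瓜书和Goodfellow的花书中的词汇对照整理出来 :heartpulse:
I hope someone can compile the vocab pairs from Zhou Zhihua's watermelon book and Goodfellow's "flower book" (花书).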

深度学习 deep learning
机器学习 machine learning
学习算法 learning algorithm
模型 model
数据集 data set
示例 instance 样本 sample
属性 attribute 特征 feature
属性值 attribute value
样本空间 sample space
特征向量 feature vector
维度数量 dimensionality

学习 learning 训练 training
训练数据 training data
训练样本 training sample, training example
训练集 training set
假设 hypothesis
真相 ground-truth
学习器 learner = model

预测 prediction
标记 label
样例 example
标记空间 输出空间 label space

分类 classification
回归 regression
二分类 binary classification
正类 positive class
反类 negative class
多分类 multi-class classification

测试 testing
测试样本 testing sample

聚类 clustering
簇 cluster

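cluster = 没有标记下的分类,通过挖掘数据结构特征发现的 (a grouping without labels, discovered from the structure of the data)
class = 给定标记的分类,事先给定的 (a grouping with labels, given in advance)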

监督学习 supervised learning
无监督学习 unsupervised learning

泛化能力 generalization
分布 distribution
独立同分布 independent and identically distributed i.i.d.

(Jeremy Howard (Admin)) #2

Great idea! I’ve wikified (怎么说?) this post so hopefully it can be a central resource for DL 中文 vocab.

(深度碎片) #3

My proposal would be
Wikified 维基化

(Jeremy Howard (Admin)) #4

Ah yes that’s perfect. I love the 化 suffix - works in so many places :slight_smile:

(深度碎片) #5

I would like to give credit to people for every bit they contribute here. So for someone’s contribution I added “thanks to @somebody”.

Thanks to @jeremy for making this post wikified.

Now that it is a wiki, you can add credit for yourselves; it is up to each contributor.

Happy adding translations! :sunny:

(深度碎片) #6

Besides the DL 中文 vocab, I just realized there is an interesting task hidden here:

As fastai is not yet extremely popular in China (though I believe it will be in the near future), I don’t think people have had the chance to come up with translations for fastai-specific terms. For example, for dataset we have 数据集, a common term that nearly anyone knows, but I don’t know how Chinese academics would name DataBunch. Maybe only you and the fastai people have the right to determine its Chinese counterpart, or maybe no Chinese counterpart is necessary for DataBunch (but for crappify, I am sure we need a translation such as 垃圾化). What do you think?

Therefore, would you be interested in guiding the translation style for fastai-specific terms?

(Jeremy Howard (Admin)) #7

I would love that! My Chinese is too rusty to suggest a direction, but hopefully adequate to provide feedback.

When I thought of crappify I thought of 垃圾化 too… although I don’t know what other words in Chinese have an association with “crappy”.

Would DataBunch be 数据堆 ? Or 数据束? Something else?..

(深度碎片) #8

Well, your Chinese is beating mine already; it took me quite a while to come up with the second one, 残次化. “残”: (it feels like) missing, crippled, disabled, falling apart; “次”: from the term “次品”, a product that fails the quality standard when inspected on the manufacturing line.

There could be more or better options but probably way beyond my vocab limit.

The two options above are fine with me, but I believe it is the respected and beloved fastai creators’ pronunciation or announcement of the name that really gives life to it.

(深度碎片) #9

Both sound perfect to me!

I guess maybe you have to declare it first.

Do you think we should add a section in the wiki for fastai created terms? People can suggest options, and you make the final official call for settling the translation?

(深度碎片) #10

When to use the term Learner vs model?

Previously, I tended to use the term model all the time, referring to everything from the architecture of a NN, to all the parameters as a whole undergoing training, to the finished model ready to predict.

After watching the videos a few times, it seems much better to differentiate model from learner in the following way:

  • model only refers to the neural network architecture, the framework, maybe including parameters before training
  • learner only refers to all parameters with certain architecture undergoing training to learn to do a task well
  • therefore, when we talk about overfitting and underfitting, we are referring to learner rather than model

Is this the preferred way of using learner and model in the fastai world?

@jeremy @stas

(Stas Bekman) #11
  • model contains the layers of the network and its weights/biases, and some status flags (e.g. trainable) - it’s the same in any DL framework.
  • Learner (and its subclasses) is a fastai “feature” that binds together data and model, optimizer, etc. and makes it easy to do almost everything a DL practitioner needs through a single interface.

So a good metaphor would be a bus - Learner is the driver console, model is perhaps the engine, and there are many other parts to the bus. And you Daniel are its driver that controls them all via that console that is Learner.
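
The split described above can be sketched in a few lines of plain Python. This is a toy illustration only: `TinyModel` and `TinyLearner` are hypothetical classes invented for this example and are not fastai's actual Learner API. The model holds just the weight and the forward computation, while the learner binds the data, the model, and the update rule behind one interface.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TinyModel:
    """The 'engine': only the parameter and the forward computation."""
    weight: float = 0.0

    def predict(self, x: float) -> float:
        return self.weight * x


@dataclass
class TinyLearner:
    """The 'driver console' (hypothetical): binds data, model, and update rule."""
    data: List[Tuple[float, float]]
    model: TinyModel
    lr: float = 0.1

    def fit(self, epochs: int) -> None:
        # Plain SGD on squared error, one example at a time.
        for _ in range(epochs):
            for x, y in self.data:
                error = self.model.predict(x) - y
                gradient = 2 * error * x
                self.model.weight -= self.lr * gradient


# Made-up data following y = 2x, so the weight should approach 2.
data = [(1.0, 2.0), (2.0, 4.0)]
learner = TinyLearner(data=data, model=TinyModel(), lr=0.1)
learner.fit(epochs=50)
print(learner.model.weight)  # close to 2.0
```

In this sketch the model never sees the data or the learning rate; everything needed for training lives on the learner, which is the point of the bus metaphor.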

(Jeremy Howard (Admin)) #12

Sure!

(深度碎片) #13

Thank you @stas for this great comparison!

(Jeremy Howard (Admin)) #14

I’ll suggest 数据堆 then. Since it’s really more a pile-of-data than a carefully curated bouquet… :slight_smile:

(深度碎片) #15

Totally agree! I will update it in the wiki now.

(Gong Junmin) #16

dropout could be translated as 随机失活. I have seen some subtitle groups translate it this way, and it fits what dropout actually does.

(Sarada Lee) #17

@Daniel thanks for all the hard work. Do you think this translation works? element-wise function = 個別元素函數

(深度碎片) #18

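Thanks @Moody
个别元素函数 is a good option; it highlights that each element passes through the function separately to produce an activation.
While thinking about my reply, another option came to mind: by analogy between element-wise and clock-wise, it could perhaps also be translated as element-wise function = 元素逐一函数 (to express that the elements are processed by the function one by one).
What do you think?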

(Sarada Lee) #19

I’m glad I inspired you to finalise the last vocabulary item. I also checked with Baidu; its result for element-wise is 逐個, which I find a bit informal. Yours is good.
Congratulations again for translating the v3 Part 1 with all the supplementary documents. You are a star. :blush:

(深度碎片) #20

Thanks @Moody

The discussion with you is very helpful and encouraging. I am glad you like 逐一, and I will update it soon.

Thanks again for your encouragement, but I am far from a star in our fastai community. I really love what fast.ai and this forum are building and sharing, and I am overwhelmed by the huge amount of treasure generated here. I guess translation turns out to be the perfect way for me to calm down, take things in slowly, and contribute back at the same time.

Also, without the support and help from you and others on the forum, it would be much harder to keep going.
