人 民 网 版 权 所 有 ,未 经 书 面 授 权 禁 止 使 用
To fill in the blanks, you need to understand a lot about audio: what speech sounds like, how harmonics relate to fundamentals, how consonants transition into vowels. The model does learn useful representations this way. But there is a catch: it has to predict exact spectrogram values. It spends a lot of its memory on remembering the exact room reverberation, microphone quirks, and other noise that aren’t important for the downstream translation model. It shifts the model’s task from learning representations to performing reconstruction.
,更多细节参见有道翻译
mask = distance max(18.0, float(threshold) * 0.9)
There is an underlying tension between the predictions of generally intelligent systems that can replace much of human cognitive labor and the money AI labs are actually spending on data to automate one task at a time. It is the difference between a future of abrupt mass unemployment and something more subtle but potentially just as disruptive: a future in which a growing number of people find work teaching AI to do the work they once did. The first wave of these workers consists of software engineers, graphic designers, writers, and other professionals in fields where the new training techniques are proving effective. They find themselves in a surreal situation, competing for precarious gigs pantomiming the careers they’d hoped to have.
"cachedChromeExtensionInstalled": false,