Re: Bits from /me: A humble draft policy on "deep learning v.s. freedom"




Hi Paul,

On 2019-05-21 23:52, Paul Wise wrote:
> Are there any other case studies we could add?

Anybody is welcome to open an issue and add more
cases to the document. I can dig into them in the
future.

> Has anyone repeated the training of Mozilla DeepSpeech for example?

Generally speaking, training is non-trivial and
requires expensive hardware. That alone makes it
much less likely that anyone has tried to
reproduce it.

A real example of how hard it is to reproduce a
**giant** model is BERT, one of the state-of-the-art
natural language representation models, which takes
about 2 weeks to train on a TPU at a cost of about $500.

Cite:
https://github.com/google-research/bert#pre-training-tips-and-caveats

> Are deep learning models deterministically and reproducibly trainable?
> If I re-train a model using the exact same input data on different
> (GPU?) hardware will I get the same bits out at the end?

Making the training program reproducible is good practice for anyone
who trains or debugs neural networks. I once wrote a simple deep learning
framework using only the C++ STL, and fell into many pitfalls along the way.
Reproducibility is very important for debugging, because mathematical
bugs are much harder to diagnose than ordinary code bugs.
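
To make the question a bit more concrete, here is a minimal sketch in
plain Python. It is not taken from the policy document or from any real
framework: the toy model (fitting y = 2x + 1) and the helper names
`train` and `bits` are made up for illustration. It assumes all
randomness comes from a single seeded generator and that everything
runs sequentially on one machine.

```python
import random
import struct

def train(seed, steps=200, lr=0.05):
    """Fit y = 2x + 1 with plain SGD; all randomness comes from one seeded RNG."""
    rng = random.Random(seed)                      # private RNG, no global state
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # random initialisation
    for _ in range(steps):
        x = rng.uniform(-1, 1)                     # random training sample
        err = (w * x + b) - (2.0 * x + 1.0)        # prediction minus target
        w -= lr * err * x                          # gradient of 0.5*err**2 w.r.t. w
        b -= lr * err                              # gradient of 0.5*err**2 w.r.t. b
    return w, b

def bits(params):
    """Exact byte representation of (w, b), for a bit-level comparison."""
    return struct.pack("<2d", *params)

# Same seed, same code, same machine: the resulting bits should match.
same_seed_identical = bits(train(42)) == bits(train(42))

# A different seed gives a different (but similarly good) model.
different_seed_identical = bits(train(42)) == bits(train(43))

# Floating-point addition is not associative, so a different summation
# order can change the low bits of a result.
reorder_changes_sum = (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)
```

The last line hints at why fixing the seed is not necessarily enough
across different hardware: if a parallel reduction adds the same numbers
in a different order, the trained weights may differ in their low bits
even though the input data and the seed are identical.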

I wrote a dedicated section about reproducibility:
https://salsa.debian.org/lumin/deeplearning-policy#neural-network-reproducibility