Zamia-Brain provides software components, tools, and models for semantic natural language processing. It is currently based on OpenAI's GPT-2.

Code for training and inference can be found here:

https://github.com/gooofy/transformer-lm

Available Models

Pre-trained models can be downloaded here:

https://goofy.zamia.org/zamia-speech/gpt2/

Model                                   Size           Language  Training corpus                                    Vocabulary
gpt2/gpt2-german-345M-r20190906.tar.xz  medium (345M)  German    4.5 epochs on 27GB twitter+wikipedia+heise+parole  50k sentencepiece
gpt2-german.tar.xz                      small (117M)   German    3 epochs on 27GB twitter+wikipedia+heise+parole    50k sentencepiece
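
The models are distributed as .tar.xz archives, so they need to be fetched and unpacked before use. The following is a minimal sketch of one way to do that with the Python standard library. The helper names (model_url, extract_model, download_model) are hypothetical and not part of Zamia-Brain, and the sketch assumes that each archive sits directly under the download URL above, which may not hold for every entry in the table.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List
from urllib.parse import urljoin

# Base URL taken from the download link above.
BASE_URL = "https://goofy.zamia.org/zamia-speech/gpt2/"


def model_url(archive_name: str, base_url: str = BASE_URL) -> str:
    """Build a download URL.

    Assumption: the archive is served directly under base_url.
    """
    return urljoin(base_url, archive_name)


def extract_model(archive_path: Path, target_dir: Path) -> List[str]:
    """Unpack an xz-compressed tar archive and return its member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:xz") as tar:
        names = tar.getnames()
        tar.extractall(target_dir)
    return names


def download_model(archive_name: str, target_dir: Path) -> List[str]:
    """Download one archive into target_dir, then unpack it there."""
    target_dir.mkdir(parents=True, exist_ok=True)
    # Store the archive under its bare file name, without any URL path prefix.
    archive_path = target_dir / Path(archive_name).name
    urllib.request.urlretrieve(model_url(archive_name), archive_path)
    return extract_model(archive_path, target_dir)
```

Under these assumptions, calling download_model("gpt2-german.tar.xz", Path("models")) would fetch the small German model and unpack it into a local models directory. Only extract archives from sources you trust, since tar archives can contain arbitrary paths.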

Credits

Massive thanks to Konstantin Lopuhin (https://github.com/lopuhin) for great code and support!