The Zamia Brain project provides infrastructure for building natural language processing systems based on transformer networks.

This project is still highly experimental; everything is subject to change without prior notice. The current approach is to generate training corpora both for pre-training and for (multi-)domain refinement. The goal is to train networks whose natural language processing capabilities are very robust (i.e. that avoid the brittleness of traditional rule-based systems) through pre-training, while allowing a certain amount of control over their behavior through refinement.

To this end, the project provides the following components:

Available Models

Downloads here:

| Model | Size | Language | Training corpus | Vocabulary |
|---|---|---|---|---|
| gpt2-german-345M-r20191119 | 345M | German | 10 epochs on 27GB twitter+wikipedia+heise+parole | 50k SentencePiece |
| transformerXL-german-163M-r20190928 | 163M | German | 1 epoch on 27GB twitter+wikipedia+heise+parole | 50k SentencePiece |
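Both models use a 50k SentencePiece subword vocabulary. The core idea behind such subword vocabularies can be sketched with a toy byte-pair-encoding (BPE) merge loop. This is a pure-Python illustration of the principle, not the actual SentencePiece trainer; the function name and toy corpus are made up for the example:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merge operations from a list of words (toy illustration).

    Each word is represented as a tuple of symbols, starting with its
    characters plus an end-of-word marker. Repeatedly, the most frequent
    adjacent symbol pair is merged into a single new symbol.
    """
    vocab = Counter(tuple(w) + ('</w>',) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for sym, freq in vocab.items():
            for a, b in zip(sym, sym[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with the merged symbol.
        new_vocab = Counter()
        for sym, freq in vocab.items():
            out, i = [], 0
            while i < len(sym):
                if i + 1 < len(sym) and (sym[i], sym[i + 1]) == best:
                    out.append(sym[i] + sym[i + 1])
                    i += 2
                else:
                    out.append(sym[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab

if __name__ == '__main__':
    corpus = ['low'] * 5 + ['lower'] * 2 + ['newest'] * 6 + ['widest'] * 3
    merges, vocab = bpe_merges(corpus, 10)
    print(merges)
```

A real 50k SentencePiece model learns far more merges from gigabytes of raw text, but the mechanism is the same: frequent character sequences become single vocabulary tokens, so the network never sees out-of-vocabulary words.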

Model Dataflow

GPT-2 Dataflow


Massive thanks to Konstantin Lopuhin for great code and support!