This repository publishes poetic texts in German generated by character-based recurrent neural network

nevmenandr/german-generated-poetic-texts

German Generated Poetic Texts - GGPT

Goal and content

This repository publishes poetic texts in German generated by character-based recurrent neural networks.

At first sight, it seems pointless to publish computer-generated poetic texts, since a computer can generate such texts in unlimited numbers. In practice, however, publishing them is useful. Researchers may need generated texts for analysis, and such texts are difficult to obtain quickly. The models that generate them require software customization and specific versions of deep learning frameworks, and trained models are not always available. In addition, the generation process may require special technical skills. All of this makes it hard for scholars in the humanities to work with such texts.

This repository contains ready-to-use texts.

The models were trained on German hexameter verse and on poetry by Friedrich Hölderlin, Theodor Fontane and Paul Celan.

One model was trained on the Celan texts, two each on the Fontane and Hölderlin texts, and three on the Hexameter texts. Each model was trained for a different number of epochs and has its own loss value.

Ten samples, each at least 28,000 characters long, were generated for each model and are presented in this repository.

Neural network architecture

Models were trained with the code developed by Andrej Karpathy for character-based multi-layer Recurrent Neural Networks (LSTM) in Torch.
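To illustrate what "character-based" means in practice, here is a minimal sketch of how a text can be turned into next-character training pairs. It is not the original Torch preprocessing code: the function names `build_vocab` and `make_training_pairs`, the sample string, and the sequence length are all assumptions made for illustration.

```python
# Hypothetical illustration of character-level training data.
# This is NOT the preprocessing code used for the published models.

def build_vocab(text):
    """Map each distinct character to an integer id (sorted for determinism)."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}


def make_training_pairs(text, vocab, seq_length):
    """Cut the text into (inputs, targets) id sequences.

    The targets are the inputs shifted by one character, so the
    network learns to predict the next character at every position.
    """
    ids = [vocab[ch] for ch in text]
    pairs = []
    for start in range(0, len(ids) - seq_length, seq_length):
        inputs = ids[start:start + seq_length]
        targets = ids[start + 1:start + seq_length + 1]
        pairs.append((inputs, targets))
    return pairs


# Arbitrary sample string, chosen only for the demonstration.
sample = "der Wald, das Wort, die Welt"
vocab = build_vocab(sample)
pairs = make_training_pairs(sample, vocab, seq_length=8)

for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Because the vocabulary consists of single characters rather than words, a model of this kind can produce letter sequences that never occur in the training corpus.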

Train sets

| Train corpus | Characters | Lines |
| --- | --- | --- |
| Hölderlin | 415,516 | 10,677 |
| Fontane | 365,360 | 10,327 |
| Celan | 267,521 | 9,757 |
| Hexameter | 605,627 | 12,516 |
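The figures in the table allow a rough comparison of line lengths across the corpora. The sketch below divides characters by lines for each train set; whether the character counts include line breaks is not stated here, so the results should be read as approximate. The names `train_sets` and `mean_line_length` are introduced only for this example.

```python
# Character and line counts copied from the train set table.
train_sets = {
    "Hölderlin": (415_516, 10_677),
    "Fontane": (365_360, 10_327),
    "Celan": (267_521, 9_757),
    "Hexameter": (605_627, 12_516),
}


def mean_line_length(characters, lines):
    """Average number of characters per line (approximate, see note above)."""
    return characters / lines


averages = {
    name: mean_line_length(characters, lines)
    for name, (characters, lines) in train_sets.items()
}

for name, value in averages.items():
    print(f"{name}: {value:.1f} characters per line")
```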

Hölderlin's poems were crawled from this web site.

Hexameter lines were extracted from a large collection of German verse maintained by Thomas Haider.

Data

Ten samples, one at each temperature from 0.1 to 1.0 in steps of 0.1, were generated for each model. For an explanation of the temperature concept, see the original Karpathy repository.
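As a rough illustration of what temperature does during sampling, the sketch below applies temperature scaling in its common form: the scores for the next character are divided by the temperature before being normalized into probabilities. This is a plain Python approximation, not the Torch sampling code; the function names and the example scores in `logits` are assumptions.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Return softmax probabilities of logits divided by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up scores for three candidate characters.
logits = [2.0, 1.0, 0.1]
rng = random.Random(0)

for temperature in (0.1, 0.5, 1.0):
    probs = apply_temperature(logits, temperature)
    choice = sample_index(probs, rng)
    print(temperature, [round(p, 3) for p in probs], choice)
```

Low temperatures concentrate almost all probability on the most likely character, which gives repetitive but conservative text; values near 1.0 leave the model's distribution unchanged and give more varied output.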

| Train | Epoch | Loss | Temperature |
| --- | --- | --- | --- |
| Hölderlin | 43.75 | 1.3026 | 0.1 |
| Hölderlin | 43.75 | 1.3026 | 0.2 |
| Hölderlin | 43.75 | 1.3026 | 0.3 |
| Hölderlin | 43.75 | 1.3026 | 0.4 |
| Hölderlin | 43.75 | 1.3026 | 0.5 |
| Hölderlin | 43.75 | 1.3026 | 0.6 |
| Hölderlin | 43.75 | 1.3026 | 0.7 |
| Hölderlin | 43.75 | 1.3026 | 0.8 |
| Hölderlin | 43.75 | 1.3026 | 0.9 |
| Hölderlin | 43.75 | 1.3026 | 1.0 |
| Hölderlin | 50.00 | 1.3049 | 0.1 |
| Hölderlin | 50.00 | 1.3049 | 0.2 |
| Hölderlin | 50.00 | 1.3049 | 0.3 |
| Hölderlin | 50.00 | 1.3049 | 0.4 |
| Hölderlin | 50.00 | 1.3049 | 0.5 |
| Hölderlin | 50.00 | 1.3049 | 0.6 |
| Hölderlin | 50.00 | 1.3049 | 0.7 |
| Hölderlin | 50.00 | 1.3049 | 0.8 |
| Hölderlin | 50.00 | 1.3049 | 0.9 |
| Hölderlin | 50.00 | 1.3049 | 1.0 |
| Fontane | 42.25 | 1.4736 | 0.1 |
| Fontane | 42.25 | 1.4736 | 0.2 |
| Fontane | 42.25 | 1.4736 | 0.3 |
| Fontane | 42.25 | 1.4736 | 0.4 |
| Fontane | 42.25 | 1.4736 | 0.5 |
| Fontane | 42.25 | 1.4736 | 0.6 |
| Fontane | 42.25 | 1.4736 | 0.7 |
| Fontane | 42.25 | 1.4736 | 0.8 |
| Fontane | 42.25 | 1.4736 | 0.9 |
| Fontane | 42.25 | 1.4736 | 1.0 |
| Fontane | 80.00 | 1.5189 | 0.1 |
| Fontane | 80.00 | 1.5189 | 0.2 |
| Fontane | 80.00 | 1.5189 | 0.3 |
| Fontane | 80.00 | 1.5189 | 0.4 |
| Fontane | 80.00 | 1.5189 | 0.5 |
| Fontane | 80.00 | 1.5189 | 0.6 |
| Fontane | 80.00 | 1.5189 | 0.7 |
| Fontane | 80.00 | 1.5189 | 0.8 |
| Fontane | 80.00 | 1.5189 | 0.9 |
| Fontane | 80.00 | 1.5189 | 1.0 |
| Celan | 46.30 | 1.5115 | 0.1 |
| Celan | 46.30 | 1.5115 | 0.2 |
| Celan | 46.30 | 1.5115 | 0.3 |
| Celan | 46.30 | 1.5115 | 0.4 |
| Celan | 46.30 | 1.5115 | 0.5 |
| Celan | 46.30 | 1.5115 | 0.6 |
| Celan | 46.30 | 1.5115 | 0.7 |
| Celan | 46.30 | 1.5115 | 0.8 |
| Celan | 46.30 | 1.5115 | 0.9 |
| Celan | 46.30 | 1.5115 | 1.0 |
| hexameter | 14.34 | 1.3988 | 0.1 |
| hexameter | 14.34 | 1.3988 | 0.2 |
| hexameter | 14.34 | 1.3988 | 0.3 |
| hexameter | 14.34 | 1.3988 | 0.4 |
| hexameter | 14.34 | 1.3988 | 0.5 |
| hexameter | 14.34 | 1.3988 | 0.6 |
| hexameter | 14.34 | 1.3988 | 0.7 |
| hexameter | 14.34 | 1.3988 | 0.8 |
| hexameter | 14.34 | 1.3988 | 0.9 |
| hexameter | 14.34 | 1.3988 | 1.0 |
| hexameter | 43.01 | 1.3479 | 0.1 |
| hexameter | 43.01 | 1.3479 | 0.2 |
| hexameter | 43.01 | 1.3479 | 0.3 |
| hexameter | 43.01 | 1.3479 | 0.4 |
| hexameter | 43.01 | 1.3479 | 0.5 |
| hexameter | 43.01 | 1.3479 | 0.6 |
| hexameter | 43.01 | 1.3479 | 0.7 |
| hexameter | 43.01 | 1.3479 | 0.8 |
| hexameter | 43.01 | 1.3479 | 0.9 |
| hexameter | 43.01 | 1.3479 | 1.0 |
| hexameter | 80.00 | 1.3702 | 0.1 |
| hexameter | 80.00 | 1.3702 | 0.2 |
| hexameter | 80.00 | 1.3702 | 0.3 |
| hexameter | 80.00 | 1.3702 | 0.4 |
| hexameter | 80.00 | 1.3702 | 0.5 |
| hexameter | 80.00 | 1.3702 | 0.6 |
| hexameter | 80.00 | 1.3702 | 0.7 |
| hexameter | 80.00 | 1.3702 | 0.8 |
| hexameter | 80.00 | 1.3702 | 0.9 |
| hexameter | 80.00 | 1.3702 | 1.0 |

See also metadata in TSV format.
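For readers who want to work with the metadata programmatically, the following sketch shows one way to read a tab-separated file and group the samples by model. The column names (`train`, `epoch`, `loss`, `temperature`) are assumptions based on the table, not the verified header of the published file, and the inline `TSV_TEXT` stands in for the real file so the example runs on its own.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the metadata file; the header names are assumed, the
# values are a few rows copied from the table.
TSV_TEXT = "\n".join(
    "\t".join(fields)
    for fields in [
        ["train", "epoch", "loss", "temperature"],
        ["Celan", "46.30", "1.5115", "0.1"],
        ["Celan", "46.30", "1.5115", "0.2"],
        ["Fontane", "42.25", "1.4736", "0.1"],
        ["Fontane", "80.00", "1.5189", "0.1"],
    ]
)


def load_metadata(handle):
    """Read tab-separated rows and convert the numeric columns to float."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "train": row["train"],
            "epoch": float(row["epoch"]),
            "loss": float(row["loss"]),
            "temperature": float(row["temperature"]),
        }
        for row in reader
    ]


def group_by_model(rows):
    """Collect sampling temperatures per (train set, epoch, loss) model."""
    models = defaultdict(list)
    for row in rows:
        key = (row["train"], row["epoch"], row["loss"])
        models[key].append(row["temperature"])
    return dict(models)


rows = load_metadata(io.StringIO(TSV_TEXT))
models = group_by_model(rows)

for (train, epoch, loss), temperatures in models.items():
    print(train, epoch, loss, temperatures)
```

To use this with a real file, replace `io.StringIO(TSV_TEXT)` with an opened file handle and adjust the column names to match its header.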

Papers

The Hölderlin texts were generated for the poet's anniversary in 2020. See the paper.

Models

Models are published on Hugging Face:

Citation

If you find this repository useful, please cite it with the URL.

@misc{orekhovboris2020ggpt,
    author = {Orekhov, Boris},
    month = sep,
    title = {{German Generated Poetic Texts - GGPT}},
    url = {https://github.com/nevmenandr/german-generated-poetic-texts},
    year = {2022}
}
