
Extending dataset with ChatGPT and training with Chain Of Thoughts


preste-ai/rnd-nlp-cot-chatgptdatagen



In 100%_synthetic_dataset_generation.ipynb you will find detailed instructions on how to generate a synthetic dataset from scratch, for the case where you have no data at all. In 90%_synthetic_dataset_generation.ipynb you will find detailed instructions on how to generate a synthetic dataset to enrich your existing data and train your models.
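A minimal sketch of what ChatGPT-based generation can look like, assuming the openai Python client (v1+) and an illustrative label set and prompt; the actual prompts and labels are the ones defined in the notebooks.

```python
# Minimal sketch of ChatGPT-based synthetic data generation.
# Assumes the openai Python client (v1+) and OPENAI_API_KEY in the environment;
# the label set, prompt wording and output file are illustrative placeholders.
import csv
from openai import OpenAI

client = OpenAI()
LABELS = ["positive", "negative"]  # hypothetical label set

def generate_examples(label: str, n: int = 5) -> list[str]:
    """Ask the model for n labelled examples plus a short rationale for each."""
    prompt = (
        f"Generate {n} short French customer reviews with the sentiment '{label}'. "
        "For each review, add a one-sentence rationale explaining the label. "
        "Return one 'review ||| rationale' pair per line."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.9,  # higher temperature gives more varied synthetic samples
    )
    return [line for line in response.choices[0].message.content.splitlines() if "|||" in line]

with open("synthetic_dataset.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "rationale", "label"])
    for label in LABELS:
        for line in generate_examples(label):
            text, rationale = (part.strip() for part in line.split("|||", 1))
            writer.writerow([text, rationale, label])
```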

In the SVM_with_rationals document we describe the training process for an SVM model using the CoT approach on the previously generated dataset. In the SVM_without_rationals document we describe the classic training process for the same SVM model, without the CoT approach, on the same dataset.
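A sketch of both SVM variants with scikit-learn, assuming a CSV with text, rationale and label columns as produced above; the column names and hyperparameters are assumptions, not the exact setup of the notebooks.

```python
# Sketch of SVM training with and without rationales (CoT), using scikit-learn.
# Assumes a CSV with 'text', 'rationale' and 'label' columns, as sketched above;
# column names and hyperparameters are placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

df = pd.read_csv("synthetic_dataset.csv")

# With rationales: the generated explanation is concatenated to the input text,
# so the classifier sees both the review and the reasoning behind its label.
df["text_with_rationale"] = df["text"] + " " + df["rationale"]

for feature_col in ["text_with_rationale", "text"]:  # with vs. without CoT
    X_train, X_test, y_train, y_test = train_test_split(
        df[feature_col], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
    )
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(X_train, y_train)
    print(feature_col)
    print(classification_report(y_test, model.predict(X_test)))
```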

In Flaubert_with_rationals you will find the fine-tuning process for the FlauBERT model with the CoT approach, using the previously generated dataset. In Flaubert_without_rationals you will find the fine-tuning process for FlauBERT without the CoT approach, on the same dataset.
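A condensed sketch of FlauBERT fine-tuning with Hugging Face transformers and datasets; the checkpoint, hyperparameters and column names are placeholders. For the with-rationales variant, the rationale would be concatenated to the text before tokenization, as in the SVM sketch.

```python
# Condensed sketch of FlauBERT fine-tuning with Hugging Face transformers/datasets.
# Checkpoint, hyperparameters and column names are placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

checkpoint = "flaubert/flaubert_base_cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("csv", data_files="synthetic_dataset.csv")["train"]
dataset = dataset.class_encode_column("label").train_test_split(test_size=0.2, seed=42)

def tokenize(batch):
    # For the with-rationales variant, tokenize text + rationale instead of text alone.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="flaubert-cot",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
print(trainer.evaluate())
```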

In the chatGPT document we apply the LLM directly to our test set in order to gather metrics and compare it with the previously mentioned models. In the zero-shot classification document we apply a zero-shot model to our test set for the same comparison.
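A sketch of the zero-shot baseline with the transformers pipeline; the checkpoint, test file and candidate labels are assumptions (a multilingual NLI model is a natural choice for French data), and the same classification_report comparison applies to the ChatGPT predictions.

```python
# Sketch of zero-shot evaluation on the test set with the transformers pipeline.
# The checkpoint, test file and candidate labels are assumptions.
import pandas as pd
from transformers import pipeline
from sklearn.metrics import classification_report

test_df = pd.read_csv("test_set.csv")  # hypothetical test split with 'text' and 'label'
candidate_labels = ["positive", "negative"]

classifier = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

predictions = [
    classifier(text, candidate_labels)["labels"][0]  # top-ranked candidate label
    for text in test_df["text"]
]
print(classification_report(test_df["label"], predictions))
```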
