Qualitative discourse analysis is crucial for social scientists studying human interaction. This project leverages large language models (LLMs) to support such analysis, a task that traditionally demands high inter-rater reliability among human coders. Coding is exceedingly labor-intensive: coders must fully understand the discussion context, consider each participant's perspective, and grasp how each sentence relates both to the preceding discussion and to shared general knowledge.
The goal is to develop a model that categorizes postings in online discussions, such as those in a corpus discussing the story "The Lady, or the Tiger?", while remaining capable of generalizing to other discussions.
Our approach incorporates multiple features to identify topic shifts driven by individual users. We fine-tuned several LLMs, such as LLaMA and Mistral, with the LoRA technique to keep training efficient, and defined a generic prompt, adaptable to both models, that includes the chat history, context from the relevant articles or stories, and a codebook of labels with examples.
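A minimal sketch of this setup using the Hugging Face `transformers` and `peft` libraries is shown below. The base checkpoint name, LoRA hyperparameters, and prompt wording are illustrative assumptions, not the exact configuration used in this project.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed base checkpoint; Mistral would be handled the same way.
BASE_MODEL = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# LoRA adapters on the attention projections keep the number of trainable
# parameters small, which is what makes per-model fine-tuning affordable.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Generic prompt shared by both models: chat history, story context,
# and the codebook of labels with examples (template wording is hypothetical).
PROMPT_TEMPLATE = """You are coding an online discussion about a story.
Story context:
{story_context}

Codebook (label: definition, example):
{codebook}

Discussion so far:
{chat_history}

Posting to classify:
{posting}

Label:"""
```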
Finally, an ensemble approach combined the predictions from the individual models, with a final model using few-shot learning to select the best prediction. To ensure explainability, we generated textual explanations with LLaMA, making the model's decisions accessible to non-expert users while guarding against hallucinations.
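The sketch below illustrates one way such an ensemble selection step could look: each fine-tuned model proposes a label, and a final LLaMA call receives a few-shot prompt asking it to pick the best candidate and justify it. Function names and prompt wording here are hypothetical, under the assumption that selection is prompt-based rather than a learned aggregator.

```python
from typing import Callable, Dict, List

def ensemble_predict(
    posting: str,
    model_predictions: Dict[str, str],        # e.g. {"llama": "Topic shift", "mistral": "Elaboration"}
    few_shot_examples: List[Dict[str, str]],  # solved examples: posting, candidates, chosen label
    selector_generate: Callable[[str], str],  # wrapper around the final LLaMA model
) -> str:
    """Build a few-shot selection prompt and return the selector's decision."""
    # Few-shot block: previously coded postings with their candidate labels
    # and the label a human coder ultimately chose.
    shots = "\n\n".join(
        f"Posting: {ex['posting']}\nCandidates: {ex['candidates']}\nBest label: {ex['label']}"
        for ex in few_shot_examples
    )
    candidates = ", ".join(f"{name}: {label}" for name, label in model_predictions.items())
    prompt = (
        f"{shots}\n\n"
        f"Posting: {posting}\n"
        f"Candidates: {candidates}\n"
        "Best label (answer with one codebook label, then a one-sentence explanation "
        "grounded in the posting):"
    )
    return selector_generate(prompt)
```

Asking the selector to ground its one-sentence explanation in the posting itself is one way to keep the generated rationale tied to the data and reduce the risk of hallucinated justifications.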