LLM-based Process Constraints Generation with Context

This repository contains the prototype and evaluation of the approach described in the master thesis LLM-based Process Constraints Generation with Context.

About the project

When analyzing the data generated by complex information systems, identifying undesirable behavior in event log traces (so-called conformance checking) is a key challenge. With the rise of deep learning and, more specifically, generative AI applications, one promising line of research is the automatic generation of (symbolic) temporal reasoning queries that can then be applied in a semi-automatic manner. Recent work has shown that fine-tuned open-source large language models (LLMs) are promising and in some aspects superior to other approaches to automated conformance checking. This thesis expands this research direction by supporting the provision of process-specific natural language context and by extending the practical expressivity of the generated constraints.
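
For illustration, the generated constraints are DECLARE-style templates over process activities, which can then be checked against event log traces. The activity names below are hypothetical and not taken from the repository's data:

    Response(create order, send invoice)      -> whenever "create order" occurs, "send invoice" must eventually follow
    Precedence(receive payment, ship goods)   -> "ship goods" may only occur if "receive payment" occurred before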

Built with

  • Platform
  • GPU
  • Python

Requirements

To apply our approach (xSemAD)

Use at least Python 3 (the python3 commands below assume a Python 3 installation; see requirements.txt for package versions).

  1. Clone this repository: git clone https://github.com/Jakobsson2001/xSemAD.git
  2. Install the requirements using the following command:
./install.sh

Make sure to adapt all file paths to your needs; a consolidated setup sketch is shown below.
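
Putting the setup steps together, a minimal sketch (assuming a Unix-like shell, the default clone directory name, and an executable install.sh):

    git clone https://github.com/Jakobsson2001/xSemAD.git
    cd xSemAD
    ./install.sh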

Project Organization

├── install.sh                                       <- Script that installs all requirements and data needed to run the pipeline.
├── constraints-transformer
│   ├── conversion                                   <- Contains the BPMN analyzer, json2petrinet, and the Petri net analyzer.
│   ├── data                                         <- Contains the data used to run the pipeline and, later, the generated results. (Created by the install.sh script)
│   ├── evaluation                                   <- Contains utility functions for evaluation.
│   ├── labelparser                                  <- Contains utility functions for label parsing.
│   ├── results                                      <- Figures for the paper.
│   ├ 01_run_paper_preprocessing_data.py             <- Script to generate the train, test, and validation sets
│   ├ 02_run_training.py                             <- Script to fine-tune FLAN-T5
│   ├ 04_run_generate_predictions_testset.py         <- Script to generate xSemAD predictions on the test set
│   ├ declare_to_textdesc.py                         <- Converts DECLARE constraints into text descriptions
│   ├ vanilla_llm_predictions.py                     <- Generates predictions using a non-fine-tuned GPT model
│   ├ random_predictions.py                          <- Generates random constraint predictions as a baseline
│   ├ 10_paper_paper_results.py                      <- Script to generate the paper results
│   ├ config.py                                      <- Config file
│   └ requirements.txt                               <- Requirements file
├── README.md                                        <- The top-level README for users of this project.
└── LICENSE                                          <- License that applies to the source code in this repository.

User Guide: Running the Pipeline

The scripts in this repository are generally run in the order in which they appear in the project structure. Below is a guide for running them:

  1. Start with preprocessing the data:

    python3 constraints-transformer/01_run_paper_preprocessing_data.py
  2. Run each subsequent script in the order it appears: follow the project structure and execute the scripts step by step (a consolidated sketch is shown after this list).

  3. Continue through the pipeline: always check for optional arguments or configurations using -h before running a script.
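
Putting the steps together, the sketch below runs the scripts in the order of the project structure; each script may require additional arguments or configuration, so check -h (and config.py) first:

    python3 constraints-transformer/01_run_paper_preprocessing_data.py
    python3 constraints-transformer/02_run_training.py
    python3 constraints-transformer/04_run_generate_predictions_testset.py
    python3 constraints-transformer/declare_to_textdesc.py
    python3 constraints-transformer/vanilla_llm_predictions.py
    python3 constraints-transformer/random_predictions.py
    python3 constraints-transformer/10_paper_paper_results.py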

Files with Arguments

Some files can be run with additional arguments or configurations. To learn about the available options for these files, use the -h (help) flag. If the file doesn’t take any arguments, it will start execution instead. Example:

python3 constraints-transformer/vanilla_llm_predictions.py -h

This will display the available arguments and their descriptions for that script. Refer to the help flag for guidance on any specific script’s functionality or configuration options, or contact the authors.

Contact

Find a bug?

If you find an issue or would like to suggest an improvement to this project, please contact the authors.
