This repository serves as a generator for the LogiNumBench benchmark. Pre-generated examples can be found at data
. Below, you will find a brief introduction to help you generate your own data.
You can directly use demo.py
for generation; you only need to control the parameters passed in. What's more, we provide js2disk.py
to transform the json file to disk file for datasets library.
To create your custom dataset, first load the template configuration file (you can also add your own), then use the Theory
class. An instance of this class will automatically handle the sample generation process. Simply pass the desired parameters as follows:
from LogiNumBench import Theory, RandomGene
import json
with open("config.json") as jsf:
obj = json.load(jsf)
RandomGene.load_config(obj)
while True:
theory = Theory(id,entityNum, attrNum, factNum, ruleNum, relationNum, depth)
if theory.assertion is not None:
break
id
: the id for this sample, pass it for generating labeled data in subsequent steps**Num
: the quantity of ** you want of the datadepth
: the reasoning depth you want of the data
It is essential to note that the depth is intricately linked with the preceding conditions. For instance, a limited quantity of facts and rules may constrain the depth of inference. So, with the parameters you select, there may not exist the reasoning depth you want. In this case, theory.assertion
will be set to None
because there is no suitable object for assertion. Since the process is random, you can regenerate until it is suitable. However, please note that the parameters you choose may never be able to generate the reasoning depth you desire.
the parameters we chose are listed below:
depth | entityNum | attrNum | factNum | ruleNum | relationNum |
---|---|---|---|---|---|
1 | 4 | 4 | 6 | 6 | 3 |
2 | 4 | 4 | 6 | 6 | 3 |
3 | 4 | 4 | 6 | 6 | 3 |
4 | 5 | 6 | 7 | 7 | 4 |
5 | 5 | 6 | 8 | 8 | 5 |
6 | 5 | 6 | 8 | 10 | 5 |
After generating a Theory instance, you can convert it into different forms.
str(theory)
transforms it into a formal descriptiontheory.nl()
provides a natural language description of input and targettheory.to_json()
yields a dictionary representation of the components
For example:
>>> theory = Theory(1, entityNum=4, attrNum=4, factNum=6, ruleNum=6, relationNum=3, depth=3)
>>> str(theory)
"ID: 1\nFact:\n0. argues(Grace, James)\n1. argues(Fiona, Grace)\n2. eats(James, Grace)\n3. is(Fiona, considerate, 1)\n4. eats(Grace, Gary)\n5. argues(Grace, Gary)\nRule:\n0. is(?x, big, 3) -> is(?x, quiet, 2)\n1. hugs(?x, ?y) -> is(?x, quiet, 4y+1)\n2. is(?x, cold, 3) -> is(?x, big, 1)\n3. eats(?x, ?y) -> is(?x, considerate, 7y+2)\n4. is(?x, considerate, 2) -> is(?x, cold, 3)\n5. is(?x, big, 4) -> is(?x, quiet, 2)\nAssertion:\nless(James, cold, 17)\nAnswer:\nTrue\nOne reasoning:\n['Rule4', ['Fact2', 'Rule3', ['Fact4', 'Rule3', ['Default']]]]\nStep reasoning:\nDefault -> int0: Gary is 0 considerate; Fact4 & Rule3 & int0 -> int1: Grace is 7*0+2=2 considerate; Fact2 & Rule3 & int1 -> int2: James is 7*2+2=16 considerate; Rule4 & int2 -> int3: James is 3 cold\n"
>>> theory.nl()
"Fact:\n0. The relationship argues exists between Grace and James.\n1. Fiona is in a state of argues with respect to Grace.\n2. In the context of eats, James and Grace share a connection.\n3. The value 1 is assigned to Fiona for the considerate field.\n4. There is a relation eats between Grace and Gary.\n5. Grace and Gary maintain a connection defined by argues.\nRule:\n0. For x, quiet is represented by the value 2 is a direct result of in x, big is registered as having 3.\n1. The quiet property of x is marked with scaling up the feature quiet of y 4 times and then adding 1 is a foregone conclusion given x and y form a connection of the hugs relationship.\n2. X is defined by 1 in the big field is a natural consequence of within x, the value of cold is 3 being true.\n3. If the eats correlation is present between x and y holds true, it follows that the considerate characteristic of x is taking the characteristic considerate of y and increasing it 7 times with an addition of 2 is inevitable.\n4. If for x, considerate is annotated as 2 holds true, it follows that x is characterized by having 3 in the cold attribute is inevitable.\n5. The value 2 is associated with x under the quiet context can be safely inferred from 4 is attributed to x under big.\nAssertion:\nJames's cold is less than 17.\nStep reasoning:\nDefault -> int0: Gary is 0 considerate; Fact4 & Rule3 & int0 -> int1: Grace is 7*0+2=2 considerate; Fact2 & Rule3 & int1 -> int2: James is 7*2+2=16 considerate; Rule4 & int2 -> int3: James is 3 cold\nAnswer:\nTrue\n"
In order to better understand the sample generated, even if theory.assertion
is None
, you can gain insights into this randomly generated sample through the following method. At the same time, you can also assess the reasonableness of certain parameter.
>>> theory.states
[[1, 0, 3, 0], [0, 0, 0, 0], [0, 0, 0, 90], [0, 0, 0, 9]]
>>> theory.reasoner
[[[('F', 3)], [('F', -1)], [('R', 3), ('S', 0, 0)], [('F', -1)]], [[('F', -1)], [('F', -1)], [('F', -1)], [('F', -1)]], [[('F', -1)], [('F', -1)], [('F', -1)], [('F', 5), ('R', 0), ('S', 3, 3)]], [[('F', -1)], [('F', -1)], [('F', -1)], [('F', 1), ('R', 0), ('S', 0, 3)]]]
theory.states[i][j]
stores the value of attribute j for entity i.
theory.reasoner[i][j]
stores the source of the value of attribute j for entity i, representing the last step of reasoning that led to obtaining this value. Specifically, there are three forms of the reasoner.
[('F',-1)]
represents that the reasoning source is defaulting to 0.[('F',i)]
signifies that the value directly comes from the specified fact i.[('R',i),('S',j,k)]
indicates that the value is derived from the inference of rule j and the value of attribute k for entity j.[('F',i),('R',j),('S',k,l)]
indicates that the value is derived from the inference of fact i and rule j, and is calculated based on the value of attribute l for entity k.
Now you can easily obtain any information about this sample!
You can add expression by modify the class Expr
in LogiNumBench.py
.
- change the
__init__
to generate your mode - complete
compute
,__str__
,nl
,expression_str
functions as the linear operation we have finished
If you want to change the assertion value generated, you can just change the code 'num': RandomGene.geneFromInterval(1, 100)
at class Assertion
.
The baselines
directory contains code for training and evaluating various model architectures. Detailed instructions are provided below.
- torch >= 2.0
- transformers
- zhipuai
- openai
The encoder_only_train.py
file is used to train the decoder-only models. It trains 12 tasks at once, including D1-6 and LD1-6.
The encoder_only_eva.py
file is used to evaluate the decoder-only models. The parameters used are set within the file. The config file is used to provide the checkpoint file after training. The format is as follows:
{
"albert_xlarge":
{
"D1":
{
"output":22000,
},
"D2":
{
"output":14000
}
}
}
It is recommended to modify the code to adapt to your specific file paths and environment settings. You can use command like the following :
python encoder_only_train.py gpu=0 ckpt="the path of pre-trained model" dataFile=/"your path"/D{depth}/disk storeFile=/"your path"/D{depth} batchsz="batch size" lr="learning rate"
python encoder_only_eva.py
Similar to the above, the encoder_decoder_train.py
file is used to train the encoder-decoder models. It trains 12 tasks at once, including D1-6 and LD1-6.
The encoder_decoder_eva.py
file is used to evaluate the encoder-decoder models. The parameters used are set within the file. The config file is used to provide the checkpoint file after training. The format is as follows:
{
"t5_base": {
"D1": 8000,
"D2": 8000,
"D3": 12000
}
}
It is recommended to modify the code to adapt to your specific file paths and environment settings. You can use command like the following :
python encoder_decoder_train.py gpu=0 ckpt="the path of pre-trained model" dataFile=/"your path"/D{depth}/disk storeFile=/"your path"/D{depth} batchsz="batch size" lr="learning rate"
python encoder_decoder_eva.py
Please configure paths carefully in the files to suit your environment.
-
ChatGLM3: To call the API, please refer to the official repository for instructions on running the server.
-
Llama2: To call the API, please refer to the official repository for instructions on running the server.
-
Llama3
-
Llama3.1
-
Gemma
-
Qwen1.5
-
Qwen2
-
ChatGPT & GPT-4o
- create_mix.py: The training set we used, with mixed depths, was collected from D1 to D5 for pre-training.
- pretrain.py: Executes the pre-training.
- RT_after_pretrain.py: Fine-tunes on RuleTaker after pre-training.
- train_on_RT.py: Directly fine-tunes on RuleTaker.