Neural Cognitive Diagnosis for Intelligent Education Systems

Source code and data set for the paper Neural Cognitive Diagnosis for Intelligent Education Systems and NeuralCD: A General Cogntive Diagnosis Framework.

The code in this repository is the implementation of NeuralCDM model, and the data set is the public data set ASSIST2009-2010.

If this code helps with your studies, please kindly cite the following publication:

@article{wang2020neural,
  title={Neural Cognitive Diagnosis for Intelligent Education Systems},
  author={Wang, Fei and Liu, Qi and Chen, Enhong and Huang, Zhenya and Chen, Yuying and Yin, Yu and Huang, Zai and Wang, Shijin},
  booktitle={Thirty-Fourth AAAI Conference on Artificial Intelligence},
  year={2020}
}

or

@article{wang2022neuralcd,
  title={NeuralCD: A General Framework for Cognitive Diagnosis},
  author={Wang, Fei and Liu, Qi and Chen, Enhong and Huang, Zhenya and Yin, Yu and Wang, Shijin and Su, Yu},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2022},
  publisher={IEEE}
}

For the implementation of NeuralCDM+, please refer to https://github.com/LegionKing/NeuralCDM_plus .

For the implementation of KaNCD (in "NeuralCD: A General Cogntive Diagnosis Framework") and other typical cognitive diagnosis models, please refer to our github repository: https://github.com/bigdata-ustc/EduCDM .

Dependencies:

python 3.6
pytorch >= 1.0 (pytorch 0.4 might be OK but pytorch<0.4 is not applicable)
numpy
json
sklearn

Usage

Run divide_data.py to divide the original data set data/log_data.json into train set, validation set and test set. The data/ folder has already contained divided data so this step can be skipped.

python divide_data.py

Train the model:

python train.py {device} {epoch}

For example:

python train.py cuda:0 5 or python train.py cpu 5

Test the trained the model on the test set:

python predict.py {epoch}

Data Set

The data/log_data.json is extracted from public data set ASSIST2009-2010 (skill-builder, corrected version) where answer_type != 'open_response' and skill_id != ''. When a user answers a problem for multiple times, only the first time is kept. The logs are organized in the structure:

log_data.json = [user1, user2, ...]
user = {"user_id": user_id, "log_num": log_num, "logs": [log1, log2, ...]]}
log = {"exer_id": exer_id, "score": score, "knowledge_code": [knowledge_code1, knowledge_code1, ...]}

The user_id, exer_id and knowledge_code correspond to user_id, problem_id and skill_id attributes in the original ASSIST2009-2010 csv file. The ids/codes are recoded starting from 1.

Details

Students with less than 15 logs would be deleted in divide_data.py.
The model parameters are initialized with Xavier initialization.
The model uses Adam Optimizer, and the learning rate is set to 0.002.

Correction

There is a mistake in the AAAI conference paper. Eq. (18) should be:

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
data		data
model		model
result		result
.gitignore		.gitignore
README.md		README.md
config.txt		config.txt
data_loader.py		data_loader.py
divide_data.py		divide_data.py
equation.JPG		equation.JPG
model.py		model.py
predict.py		predict.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Neural Cognitive Diagnosis for Intelligent Education Systems

Dependencies:

Usage

Data Set

Details

Correction

About

Releases

Packages

Languages

KevinXiao1225/Neural_Cognitive_Diagnosis-NeuralCD

Folders and files

Latest commit

History

Repository files navigation

Neural Cognitive Diagnosis for Intelligent Education Systems

Dependencies:

Usage

Data Set

Details

Correction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages