Query-Independent Document Specificity Scoring

This project calculates a pointwise, query-independent document specificity score for use in document ranking.

Standard pointwise learning-to-rank methods score a document using only the terms that appear in both the query and the document. The methods in this project instead compute the score over every term in the document, so the result does not depend on the query.

Methods

This project provides the following models:

Normalized inverse document frequency (NIDF) based specificity score
Term entropy based specificity score

More details on the formulas used can be found in FORMULAS.md.
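To illustrate scoring over every term in a document, here is a minimal sketch of an IDF-style specificity score. It is an assumption for illustration only: the `TermStats` struct and `idfSpecificity` function are hypothetical and are not part of this library, and the formula is a textbook IDF normalized by log(N), which may differ from the one defined in FORMULAS.md.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-term statistics (NOT the library's Term type).
struct TermStats {
    std::size_t tfInDoc;  // occurrences of the term in this document
    std::size_t docFreq;  // corpus documents containing the term, assumed in [1, N]
};

// Illustrative score: the term-frequency-weighted mean of idf / log(N),
// taken over every term in the document rather than only query terms.
// Rare terms push the score towards 1, ubiquitous terms towards 0.
double idfSpecificity(std::size_t numDocsInCorpus,
                      const std::vector<TermStats>& terms) {
    if (terms.empty() || numDocsInCorpus < 2) {
        return 0.0;  // log(N) would be zero or undefined
    }
    const double n = static_cast<double>(numDocsInCorpus);
    const double maxIdf = std::log(n);

    double weightedSum = 0.0;
    double totalTf = 0.0;
    for (const TermStats& t : terms) {
        if (t.docFreq == 0) {
            continue;  // skip terms with no corpus statistics
        }
        const double idf = std::log(n / static_cast<double>(t.docFreq));
        const double tf = static_cast<double>(t.tfInDoc);
        weightedSum += tf * (idf / maxIdf);
        totalTf += tf;
    }
    return totalTf > 0.0 ? weightedSum / totalTf : 0.0;
}
```

Under these assumptions, a document made up mostly of rare terms scores close to 1, while one dominated by terms found in nearly every document scores close to 0.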

Getting Started

Adding to your project

The recommended way to add this library to your project is to include the following in your CMakeLists.txt:

cmake_minimum_required(VERSION 3.13)
project(myProject)

include_directories("path/to/static-doc-specificity/include")
add_subdirectory("path/to/static-doc-specificity")

# myProject_SOURCES is a variable holding the list of your source files
add_executable(myProject ${myProject_SOURCES})
# or `add_library(myProject ${myProject_SOURCES})`

target_link_libraries(myProject staticspecrank)

Usage

The library can be included in your source files with the following:

#include <staticSpecRank/Term.h>
#include <staticSpecRank/calcSpecificityScore.h>

The score for a given document can be calculated by calling specScore::calcSpecificityScore(scoreBase, numDocsInCorpus, docSize, docTermVector), where scoreBase is 0 for NIDF or 1 for term entropy.

The docTermVector must be of type std::vector<Term>. See the file include/staticSpecRank/Term.h for details on constructing the term vector.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Authors

Patrick Cox - paddy74

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Acknowledgements

This project is based on the following paper:

Zheng L., Cox I.J. (2009) Re-ranking Documents Based on Query-Independent Document Specificity. In: Andreasen T., Yager R.R., Bulskov H., Christiansen H., Larsen H.L. (eds) Flexible Query Answering Systems. FQAS 2009. Lecture Notes in Computer Science, vol 5822. Springer, Berlin, Heidelberg
