Read the research paper: Research Paper
Web users who search for information via a search engine have very little insight into the Information Quality (IQ) of Web documents. We trained a multiple-output regression model to estimate the IQ of Web documents from historical data in which features and assessments were collected. To retrieve IQ scores, we automatically collect a document's features and predict its IQ from the patterns that our algorithm learned. The model for semi-automatically assessing the IQ of Web documents was inspired by the work of Ceolin et al. Compared to their framework, our framework can also retrieve documents related to a topic of interest to the user in a comparable manner, provides descriptive insight into the content of the Web document, and improves the responsiveness of the information.
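To illustrate what "multiple-output regression" means here, the following is a minimal sketch in which one model predicts several quality scores at once from a document's features. The source does not state which regressor, features, or quality dimensions were used, so everything below is an assumption: the feature names, the target names, the training rows, and the choice of k-nearest-neighbour averaging (picked only because it fits in a few lines of standard-library Python).

```python
import math

# Hypothetical feature and target names; none of these come from the paper.
FEATURES = ("word_count", "num_links", "num_images")
TARGETS = ("accuracy", "completeness", "neutrality")

# Made-up historical data: (feature vector, assessed quality scores).
TRAIN = [
    ((1200.0, 35.0, 4.0), (4.0, 4.5, 3.5)),
    ((300.0, 2.0, 0.0), (2.0, 1.5, 3.0)),
    ((800.0, 20.0, 2.0), (3.5, 3.5, 4.0)),
    ((150.0, 1.0, 1.0), (1.5, 1.0, 2.5)),
]


def predict_iq(features, train=TRAIN, k=2):
    """Predict all targets at once by averaging the k nearest documents.

    Features are used unscaled here for brevity; a real model would
    normally standardise them first.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    n_targets = len(nearest[0][1])
    return tuple(
        sum(row[1][j] for row in nearest) / len(nearest)
        for j in range(n_targets)
    )


scores = predict_iq((1000.0, 30.0, 3.0))
print(dict(zip(TARGETS, scores)))
```

The point of the sketch is the shape of the problem: a single call returns one score per quality dimension, rather than training and querying a separate model for each.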
- Multiple Target Regression Machine Learning
- Multi-core Processor Crawler nested with Multi-Threaded Crawler (Mixed Concurrency and Parallelism)
- Search by search engine
- Web Server
- Fault recovery
- Pivot-Grid
- Personalized Content and Information Quality based Recommendation System
- Crawl the entire Web and restructure it
- Advanced Text Mining
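The "multi-core crawler nested with multi-threaded crawler" item above can be pictured roughly as follows: URLs are split across worker processes (parallelism), and each process fetches its share with a thread pool (concurrency). This is only a sketch of the general pattern, not the project's actual crawler. The function names are hypothetical, and `fetch` is a stand-in that sleeps instead of making a real HTTP request.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time


def fetch(url):
    """Stand-in for a real HTTP request (hypothetical); simulates I/O wait."""
    time.sleep(0.01)
    return url, len(url)


def crawl_batch(urls, threads=4):
    """Runs inside one process: fetch a batch of URLs with a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, urls))


def crawl(urls, processes=2, threads=4):
    """Split URLs into one batch per process; threads fetch each batch."""
    batches = [urls[i::processes] for i in range(processes)]
    with ProcessPoolExecutor(max_workers=processes) as pool:
        results = pool.map(crawl_batch, batches, [threads] * processes)
        return [item for batch in results for item in batch]


if __name__ == "__main__":
    seeds = [f"https://example.org/page/{n}" for n in range(8)]
    for url, size in crawl(seeds):
        print(url, size)
```

Threads suit the waiting involved in network requests, while separate processes let CPU-bound work such as parsing use more than one core.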
- Ubuntu
- Python 3
- Apache 2.4
- Flask
- Gunicorn
- init.d process
In our other repository you will find how we configured the tool, along with some suggestions: Qupid
We blocked the open ports with our internal Firewall.
Questions: ozkansener@gmail.com
Vrije Universiteit Amsterdam