Extract the PDF of the ANSM drug interaction thesaurus

scossin/IMthesaurusANSM

IMthesaurusANSM (DEPRECATED - See https://github.com/scossin/ExtractThesaurusANSM)

R package developed to extract the content of the drug interaction thesaurus published as a PDF by ANSM (the French national drug safety agency) and convert it into a structured format (an R data frame).

Reference: Cossin S. Interactions médicamenteuses : données liées et applications. 30 Nov 2016.

Installation

To install the current development version from GitHub:

# install.packages("devtools")
devtools::install_github("scossin/IMthesaurusANSM")

How it works

The R folder contains the extraction programs for the PDF thesauri published since 2009. Each PDF file is first converted to a text file with Apache Tika (version 1.1). Then one R script per thesaurus transforms the text file into an R data frame; all resulting data frames are stored in the data folder. All R scripts use the same class, defined in "parsing_POO_thesaurus.R".
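
To give a feel for the text-to-data-frame step, here is a minimal sketch in base R. It does not reproduce the logic of "parsing_POO_thesaurus.R". The input layout (an uppercase line for the main substance, a line starting with "+" for the interacting substance, then free-text description lines), the function name `parse_toy_thesaurus`, and the column names are all assumptions made for illustration.

```r
# Hypothetical, simplified parser: NOT the package's actual implementation.
# Assumed layout of the text extracted from the PDF:
#   - a line in uppercase names the main substance
#   - a line starting with "+" names an interacting substance
#   - any other line is free text describing the preceding interaction
parse_toy_thesaurus <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop empty lines
  rows <- list()
  main <- NA_character_
  for (line in lines) {
    if (startsWith(line, "+")) {
      # New interaction for the current main substance
      partner <- trimws(sub("^\\+", "", line))
      rows[[length(rows) + 1]] <- data.frame(
        substance1 = main,
        substance2 = partner,
        description = "",
        stringsAsFactors = FALSE
      )
    } else if (line == toupper(line)) {
      # Uppercase line: switch to a new main substance
      main <- line
    } else if (length(rows) > 0) {
      # Free text: append to the description of the latest interaction
      idx <- length(rows)
      rows[[idx]]$description <- trimws(paste(rows[[idx]]$description, line))
    }
  }
  if (length(rows) == 0) {
    return(data.frame(substance1 = character(0), substance2 = character(0),
                      description = character(0), stringsAsFactors = FALSE))
  }
  do.call(rbind, rows)
}

# Invented toy input standing in for the text extracted from a PDF
toy_text <- c(
  "SUBSTANCE A",
  "+ SUBSTANCE B",
  "Toy description one.",
  "+ SUBSTANCE C",
  "Toy description two.",
  "SUBSTANCE D",
  "+ SUBSTANCE A",
  "Toy description three."
)
toy_df <- parse_toy_thesaurus(toy_text)
print(toy_df)
```

The real PDF layout varies between thesaurus versions, which is presumably why the package keeps one script per thesaurus; a sketch like this would need to be adapted to each one.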

CSV files

If you are not familiar with R, you may want to convert the R data frames into CSV files. See the "exampleThesaurusToCSV.R" file for an example.
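
As a rough idea of what such a conversion can look like, here is a minimal sketch in base R. It is not the content of "exampleThesaurusToCSV.R". The helper name `export_rda_to_csv`, the toy data frame, and its column names are hypothetical; the sketch also assumes the data frames are stored in `.rda` files that `load()` can read.

```r
# Hypothetical helper: write every data frame found in an .rda file to CSV.
export_rda_to_csv <- function(rda_path, out_dir = tempdir()) {
  env <- new.env()
  object_names <- load(rda_path, envir = env)   # names of the loaded objects
  written <- character(0)
  for (name in object_names) {
    obj <- get(name, envir = env)
    if (is.data.frame(obj)) {
      csv_path <- file.path(out_dir, paste0(name, ".csv"))
      write.csv(obj, csv_path, row.names = FALSE, fileEncoding = "UTF-8")
      written <- c(written, csv_path)
    }
  }
  written                                        # paths of the CSV files created
}

# Toy data standing in for a thesaurus data frame (invented columns and values)
toy_thesaurus <- data.frame(
  substance1 = c("SUBSTANCE A", "SUBSTANCE D"),
  substance2 = c("SUBSTANCE B", "SUBSTANCE A"),
  stringsAsFactors = FALSE
)
rda_file <- file.path(tempdir(), "toy_thesaurus.rda")
save(toy_thesaurus, file = rda_file)

csv_files <- export_rda_to_csv(rda_file)
print(csv_files)
```

With the real package, `rda_path` would point to one of the files in the data folder instead of the toy file created here.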
