Create a library of research papers! Download filtered lists from https://core.ac.uk and aggregate them into one visually appealing, Excel-friendly format!

CyrylSz/CORE-To-Excel-Aggregator

CORE to Excel Aggregator

Organize your research papers into a personal data hub! Easily combine complex CSV files sourced from https://core.ac.uk (the world's largest collection of open-access research papers) into a single TXT file in an Excel-friendly format!

ss1

How to Use:

  1. Download CORE-To-Excel.zip from Releases and Extract All...
  2. Place all TXT or CSV files downloaded from https://core.ac.uk into the "CORE-Data" folder.
    • Note: Files are processed alphabetically, so make sure the newest files sort last to avoid incorrect checkbox assignments. The files will later be renamed automatically in numerical order.
    • Note: The first line of each file should follow this format:

    "workID","oaiID","doi","title","authors","createdDate"

  3. Launch the CoreToExcelAggregator.jar file. The default separator is "|"; changing it is not recommended if you plan to proceed to the next step. If the file paths are correct, the play button will be enabled. Press it!
    • Note: Place the .jar file in the same folder as the "CORE-Data" and "Excel-Output" folders, and do not rename those folders. This lets the paths be filled into the text fields automatically on startup.
    • Note: You need to have Java installed. You can download it from https://www.oracle.com/java/technologies/downloads/.
  4. Open example-Research-Papers-Library.xlsm, and go to: Data → Queries & Connections → double-click "output" → under "APPLIED STEPS" click "Source" → in the formula bar near the top, adjust the path (it should end with "Excel-Output\output.txt").
    • Note: If you later change the location of output.txt or your main folder, you will need to repeat this step. The Excel workbook itself isn't location-dependent.
  5. Save and reopen the workbook, and you're done! The output.txt file should now load automatically whenever you open the Excel workbook.
    • Note: Checkboxes aren't automatically populated, but you can easily drag the fill handle to extend them down the first column.
    • Tip: If you've completed your library, you can disable the automatic loading of output.txt by navigating to: Data → Queries & Connections → right-click on "output" → Properties... → uncheck "Refresh data when opening the file".
    • Tip: For a cleaner view, consider hiding the second and third columns.
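
If you want to check your input files before running the tool, the two notes from step 2 (header format and alphabetical processing order) can be sketched roughly as below. This is only an illustration, not the code inside CoreToExcelAggregator.jar: the class name `CoreInputCheck`, both method names, and the plain case-sensitive string sort are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CoreInputCheck {
    // Header line quoted in step 2 of this README.
    static final String EXPECTED_HEADER =
        "\"workID\",\"oaiID\",\"doi\",\"title\",\"authors\",\"createdDate\"";

    // True if the first line of a file matches the expected header.
    // A leading UTF-8 byte order mark and surrounding whitespace are ignored.
    static boolean hasExpectedHeader(String firstLine) {
        if (firstLine == null) {
            return false;
        }
        String line = firstLine.startsWith("\uFEFF") ? firstLine.substring(1) : firstLine;
        return line.trim().equals(EXPECTED_HEADER);
    }

    // Returns the file names in plain alphabetical (lexicographic) order.
    // Assumption: the real tool may order files slightly differently.
    static List<String> processingOrder(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("core-2.csv", "core-10.csv", "core-1.csv");
        // Prints [core-1.csv, core-10.csv, core-2.csv]
        System.out.println(processingOrder(names));
        // Prints true
        System.out.println(hasExpectedHeader(EXPECTED_HEADER));
    }
}
```

Note that with a plain string sort "core-10.csv" lands before "core-2.csv", so zero-padded names such as "core-02.csv" are a safer way to keep the newest files last.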

ss2

Contents:

  • Merging all files into one and fixing formatting issues.
  • Deleting redundant research papers from the list.
  • Packaging IDs into links for improved usability.
  • Dividing each row into 6 parts with 5 specified separators.
  • Removing unnecessary symbols and decoding Unicode.
  • Excel stuff: checkboxes, conditional formatting, macros...
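
To make the list above more concrete, here is a minimal sketch of three of those steps: dropping duplicates, packaging an ID into a link, and writing each row as 6 parts joined by 5 separators. It is not the tool's actual source. The class name `AggregateSketch`, the method names, the choice to detect duplicates by workID, and the use of https://doi.org/ as the link prefix for DOIs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AggregateSketch {
    // Each row is six strings in header order:
    // workID, oaiID, doi, title, authors, createdDate.

    // Keeps the first row seen for each workID and preserves input order.
    // Assumption: duplicates are identified by workID alone.
    static List<String[]> dropDuplicates(List<String[]> rows) {
        Map<String, String[]> byWorkId = new LinkedHashMap<>();
        for (String[] row : rows) {
            byWorkId.putIfAbsent(row[0], row);
        }
        return new ArrayList<>(byWorkId.values());
    }

    // Turns a bare DOI into a clickable resolver link; an empty DOI stays empty.
    static String doiLink(String doi) {
        if (doi == null || doi.trim().isEmpty()) {
            return "";
        }
        return "https://doi.org/" + doi.trim();
    }

    // Joins the six parts of a row with five copies of the separator.
    static String toLine(String[] row, String separator) {
        return String.join(separator,
            row[0], row[1], doiLink(row[2]), row[3], row[4], row[5]);
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "oai:a", "10.1000/x", "Title A", "Doe, J.", "2020-01-01"},
            new String[] {"1", "oai:a", "10.1000/x", "Title A", "Doe, J.", "2020-01-01"},
            new String[] {"2", "oai:b", "", "Title B", "Roe, R.", "2021-05-05"});
        // Prints:
        // 1|oai:a|https://doi.org/10.1000/x|Title A|Doe, J.|2020-01-01
        // 2|oai:b||Title B|Roe, R.|2021-05-05
        for (String[] row : dropDuplicates(rows)) {
            System.out.println(toLine(row, "|"));
        }
    }
}
```

A title or author list that itself contains the separator would shift the columns in Excel, which is one reason the symbol clean-up step matters before the rows are joined.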

Why:

  • Why aggregate already aggregated data? Unfortunately, https://core.ac.uk allows you to download only a maximum of 1000 entries at once, and throwing everything into one folder makes the process much easier.
  • This specific CSV format also makes it almost impossible to load the data in a visually acceptable way using Excel alone.