Welcome to our dataobservatory.eu R, Hugo, and open data ecosystem. We are happy to introduce you to open source development and open knowledge management, regardless of your level of experience with R or Github. We kindly ask you to take the Contributor Covenant Pledge before we start collaborating.
“We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.”
Please read the entire covenant here.
- Name, affiliation, education details, one-line and short biography. Please send back this bio_template.txt text file with your details or, if you know markdown, use this version. The files are identical, but your word processor may not know how to open an .md file.
- Your ORCiD, to resolve ambiguity with similarly named people. If you use other library or publication service IDs, such as Google Scholar or Publons, you may provide them too, but we do need an ORCiD ID, because most of the EU open science infrastructure and the R ecosystem use it. If you do not have one, please create one; it only takes a few minutes. Please add it to the bio_template.txt.
- Your LinkedIn ID, add it to the bio_template.txt.
- Your Github account name. If you do not have one, please create one. As a data curator, you may not need it, but if you contribute to our R&D or publication efforts, you will need it.
- Your Keybase account name. If you do not have one, please create one if you want to chat with us, or exchange calls and data with us, in a discreet, free, open-source and secure environment. Keybase is an open-source substitute for Slack. It is owned by Zoom and it can start Zoom or Google Meet calls.
- Twitter account name, provided that you use Twitter for professional uses.
- We are seeking an affiliation with mastodon.green, which will be a climate positive, decentralized and ethical alternative to other social media. More details soon.
- Facebook account name, provided that you use Facebook for professional uses.
- Any other social media that you use strictly professionally.
- You should follow our file naming conventions, and avoid the use of special characters in any file names at all times: `$`, `:`, `;`, `,`, `.`, `"`, `'` (tick), or the backtick.
- Please send us one professional portrait of at least 400x400 pixels.
Data curators do not need to be knowledgeable about data science or programming. They should have strong domain-specific knowledge and an interest in empirical data collection and data quality in their professional or research areas.
We will rely on your expertise as a data curator to publish or release new data. Therefore, we need a filled-in curator biography template with the following information about you.
You should get familiar with the following concepts. We will describe them in blogposts.
- FAIR Principles: improve the Findability, Accessibility, Interoperability, and Reuse of digital assets.
- DataCite: A persistent, standardized approach to access, identification, sharing, and re-use of datasets. This is our favored way of describing data for future use according to the FAIR principles. Many EU open science repositories will ask you to submit your publications with this documentation.
- Biblatex is a standard text file format used by citation engines, bibliography management tools, and scientific publication templates. (See, for example, the Overleaf Biblatex tutorial.)
- Dublin Core is an older international standard than DataCite, but the two standards greatly overlap. Dublin Core was originally developed by libraries. You may often need to fill out Dublin Core properties for publication.
- You should follow our file naming conventions, and avoid the use of special characters in any file names at all times: `$`, `:`, `;`, `,`, `.`, `"`, `'` (tick), or the backtick.
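To get a feel for what a Biblatex-style citation record looks like, the following minimal sketch uses only base R to export the citation of R itself as a BibTeX entry, a format that Biblatex-based templates can read. It is an illustration rather than a required step; using R's own citation as the example record and the file name `r_citation.bib` are our assumptions.

```r
# Export the citation of R itself as a BibTeX entry (base R only).
r_citation <- citation()            # a bibentry object describing R
bib_lines  <- toBibtex(r_citation)  # character vector, one element per line

# Write the entry to a .bib file that a reference manager or a TeX
# template could import. The file name and the temporary folder are
# only examples; use a permanent project folder in practice.
bib_file <- file.path(tempdir(), "r_citation.bib")
writeLines(bib_lines, bib_file)

# Show the entry in the console.
print(bib_lines)
```

The same `toBibtex()` call also works on `citation("some_package")` for any installed package, which is a quick way to cite the software you used.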
For co-authorship, you should be familiar with tools that help the asynchronous co-writing of papers. We mainly use Github for this purpose.
Additionally we need this from you:
- Your Github account name. Still not a must, but eventually it is in your interest to be able to work with Git.
- Familiarity with the TeX format for scientific publishing, and with exporting your citations to Biblatex format.
- Share citations with us with Zotero, an open-source bibliography management tool that integrates well with browsers. Share your Zotero account name with us.
- You should follow our file naming conventions, and avoid the use of special characters in any file names at all times: `$`, `:`, `;`, `,`, `.`, `"`, `'` (tick), or the backtick.
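The naming rule above can be checked mechanically. The sketch below is one possible check in base R, not an official tool of ours: the function name `has_special_chars()` is hypothetical, and we assume that the single dot before the file extension is acceptable, which the rule itself does not spell out.

```r
# Hypothetical helper: TRUE if a file name contains any of the characters
# that the naming conventions ask you to avoid.
has_special_chars <- function(file_name) {
  # Assumption: the dot that separates the extension is allowed,
  # so only the base name without its last extension is checked.
  stem <- tools::file_path_sans_ext(basename(file_name))
  # Character class: $ : ; , . " ' and the backtick
  grepl("[$:;,.\"'`]", stem)
}

has_special_chars("bio_template.txt")   # FALSE
has_special_chars("my.bio,final.txt")   # TRUE
```

Running such a check on a folder, for example with `has_special_chars(list.files())`, shows at a glance which files should be renamed before you share them.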
We will onboard you gently: if you are not familiar with Github, you can start collaborating with us in Google Docs.
As an author of an article, paper, or software, you will sooner or later work with Github. All our main source files (both documents and software) are stored and shared in Github repositories. Github repositories (repos) are folders that can be synchronized among many collaborators, who can work on tasks in parallel without overwriting each other’s work.
- Your Github account name
- For Windows users, it is recommended to install Github Desktop.
Get your computer ready for co-working:
-
You will need the cross-platform Java programming environment on your computer. It facilitates collaboration across Linux, BSD/Mac OS X and Windows. You most probably have it already, so this is a good opportunity to check whether you have the latest version. If not, do upgrade, both for security and functionality reasons, and remove the old versions at the same time. Follow this link: https://www.java.com/en/download/.
-
If you recently installed R, you most likely have the latest version. If not, then run `install.packages("installr")` and run `installr::check.for.updates.R()` (the installr package works on Windows only). If there is a newer R release, you should upgrade. `installr::updateR()` will take you through the process, including moving your already installed packages to the new R installation; however, it will not remove the old R environment. You should run the upgrader from the R GUI (you will find this somewhere on your computer, even though you may have forgotten about it because you always use R from RStudio).
-
The copying of the old R packages is not always successful. You can prepare for this by saving the list of installed packages before your upgrade. I do not mind reinstalling my packages, though. It reminds me to remove detritus and review my own developments.
-
One package that is worth running after every new install is tinytex: run `tinytex::reinstall_tinytex()` or `tinytex::install_tinytex()`. TinyTeX is a lightweight TeX distribution, and it gives you access to many TeX libraries from CTAN, such as fonts, formatting tools for TeX, and so on. This is required for the efficient creation of PDF files, in package documentation or elsewhere.
-
Now that you have the latest version of R, install Rtools, too. https://cran.r-project.org/bin/windows/Rtools/
-
Now install RStudio, or, if you already have it, check if you have the latest version. (Help Menu, Check for Updates.)
-
Install the `usethis` and `devtools` packages with all their dependencies. You should run `install.packages("devtools")` and see if all dependencies install without error. If not, you must figure out why some components are not installing.
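As a precaution for the upgrade steps above, the list of installed packages can be saved before upgrading R and compared afterwards. The following is a minimal sketch in base R; the helper names `snapshot_packages()` and `missing_packages()` and the snapshot file name are our own inventions for illustration.

```r
# Hypothetical helper: save the names of all currently installed packages.
snapshot_packages <- function(path) {
  pkgs <- rownames(installed.packages())
  writeLines(sort(unique(pkgs)), path)
  invisible(path)
}

# Hypothetical helper: which packages from a snapshot (or any wish list)
# are not installed in the current R library?
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

# Before the upgrade: write the snapshot. The temporary folder is used
# here only so that the example runs anywhere; in practice choose a
# permanent location that survives the upgrade.
snapshot_file <- file.path(tempdir(), "installed_packages.txt")
snapshot_packages(snapshot_file)

# After the upgrade: read the snapshot back and list what is missing.
to_install <- missing_packages(readLines(snapshot_file))
# install.packages(to_install)  # uncomment to reinstall the missing ones

# The same helper can confirm the packages named in this guide.
missing_packages(c("usethis", "devtools", "tinytex"))
```

An empty result from the last call means that all three packages are present in your library.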
RStudio is one of the best integrated development environments in the world. It facilitates cross-language development: you can simultaneously work on R, Python, C++ (Rcpp), SQL, D3, Stan code and text, and even make them work together.
-
You must connect your RStudio to your Github account. If you already have a Github account, but you have not used it recently, or did not connect RStudio to it lately, you are likely to have to do it again.
-
Github has not supported password authentication for Git operations since August 13, 2021. This means that you cannot synchronize your offline and online repository using your username and password combination.
-
You can no longer synchronize the repository on RStudio with a repository URL only. For example, to synchronize `https://github.com/rOpenGov/retroharmonize`, you must explicitly tell your computer to synchronize via this URI: `git@github.com:rOpenGov/retroharmonize.git`, which will require the use of a
-
Happy Git and GitHub for the useR guides you through the process on Linux, Mac OS or Windows platforms. Put this into practice at the end of this document.
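The two address forms mentioned above differ only in notation. As an illustration, a small base R helper can rewrite the browser-style https address of a Github repository into the SSH form. The function name `github_ssh_url()` is hypothetical and not part of any package, and the sketch only handles plain `https://github.com/<owner>/<repo>` addresses.

```r
# Hypothetical helper: turn an https Github repository address into
# the corresponding SSH address. Inputs that do not match the expected
# pattern are returned unchanged.
github_ssh_url <- function(https_url) {
  sub(
    "^https://github\\.com/([^/]+)/([^/]+?)(\\.git)?/?$",
    "git@github.com:\\1/\\2.git",
    https_url,
    perl = TRUE
  )
}

github_ssh_url("https://github.com/rOpenGov/retroharmonize")
# "git@github.com:rOpenGov/retroharmonize.git"
```

Setting up the credentials themselves is outside the scope of this sketch; Happy Git and GitHub for the useR covers that part.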
Microskills to pick up or improve:
-
You must be able to raise an issue via Github. An issue can be a bug report, a suggestion to change how the code works, or a suggestion to add, improve, or change documentation.
-
You must be able to read a response to an issue, and accept a solution offered by somebody.
-
You must be able to read our issue/task cards on our kanban-style Github Project management tool.
-
You should be able to write, move, solve cards in the Github Project.
-
Learn how to improve our software documentation in Rmd and R files.
-
You should learn to write a so-called reprex to correctly report a bug.
-
You must use `file.path` or `here` from the here package to write file paths that work independently of the computer and the operating system.
-
You should follow our file naming conventions, and avoid the use of special characters in any file names at all times: `$`, `:`, `;`, `,`, `.`, `"`, `'` (tick), or the backtick.
-
Use goodpractice to improve the code quality and readability.
-
Most of our packages depend on various components of the tidyverse, such as dplyr, tidyr, and purrr.
-
These packages depend, among others, on rlang for the `.data` pronoun and magrittr for the pipe operator.
-
When using non-standard evaluation, follow the modern evaluation practice of the tidyverse and rlang: avoid the old `.` pronoun and use the more precise `.data` pronoun instead. Use the `.data$foo` reference style. Instead of `select(df, geo)` use `select(df, .data$geo)`.
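A minimal sketch of the `.data$foo` reference style inside a function, assuming that dplyr is installed. The data frame, its columns `geo` and `value`, and the function name `mean_value_for()` are made up for illustration. The sketch uses the data-masking verbs `filter()` and `summarise()`; to our knowledge, recent tidyselect releases prefer quoted column names over `.data` inside selection verbs such as `select()`, so please check the documentation of your installed version.

```r
library(dplyr)

# Made-up example data.
df <- data.frame(
  geo   = c("NL", "NL", "HU"),
  value = c(1, 3, 10)
)

# Hypothetical function: the mean of `value` for one country code.
# .data$geo and .data$value refer unambiguously to columns of `dat`,
# while `country` is an ordinary function argument.
mean_value_for <- function(dat, country) {
  dat %>%
    filter(.data$geo == country) %>%
    summarise(mean_value = mean(.data$value))
}

result <- mean_value_for(df, "NL")
result$mean_value
# 2
```

In package code, the explicit pronoun also makes it clear to readers and to automated checks which names are columns and which are variables.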
-
Run into a problem? We use Keybase, the open-source, encrypted and privacy-sensitive competitor of Slack. You can ask for help in our Keybase Community.
-
Fork any of our repositories that you would like to contribute to into your own Github profile, for example, https://github.com/rOpenGov/retroharmonize/ to yourusername/retroharmonize.
-
Send a pull request when you have something to commit to our work.
- Star this repo: dataobservatory-eu/new-contributors ⭐
Thanks! It is similar to giving us a like or a 🧡 on social media.
- Fork this repo into your own space on Github, i.e. create a copy that you can modify or download to your computer.
knitr::include_graphics(file.path("png", "fork_this_repo.png"))
After pressing fork, you can make a copy to `https://github.com/<your-github-id>/new-contributors`. This is your copy, and if you have followed the instructions, you can download it to your computer and edit the document in RStudio or any text editor.
- Synchronize with RStudio. By navigating to `File Menu` -> `New Project` -> `Version control` -> `Git`, you will end up with this dialog box.
If you have followed the Happy Git and GitHub for the useR, you have built up a secure authentication workflow that works a bit differently on Windows than on Linux, because of deeper differences between the operating systems.
- On Windows, you paste the https:// protocol URL of your Github fork, i.e. `https://github.com/<your-github-id>/new-contributors`, instead of `https://github.com/dataobservatory-eu/new-contributors` shown below.
knitr::include_graphics(file.path("png", "synchronize_with_rstudio.png"))
- On Linux, you use the SSH URL: `git@github.com:<your-github-id>/new-contributors`.
knitr::include_graphics(file.path("png", "synchronize_with_r.png"))
Whichever URL you copy into RStudio, you will be able to download the repository contents with the Pull button (blue arrow down).
knitr::include_graphics(file.path("png", "pull_push_with_rstudio.png"))
Once you have all the files present, add an emoji or a sentence to the end of the `README.Rmd` file, and tick the `Commit` checkbox near the name of the file. [When there are `Rmd` and `md` files present, always edit the `Rmd`, which will generate the `md`, but not the other way around.]
If you press the Push button (green arrow up), things should upload to `https://github.com/<your-github-id>/new-contributors` without asking for your Github username and password. Why? Because you can download, at least from public repositories, without authentication at any time. You can even download a repo as a .zip file in your browser. However, since 2021 Github no longer allows writing into a repository with password authentication, only with the far more secure personal access token (PAT) or SSH key. If you followed the Happy Git and GitHub for the useR, your computer, including RStudio, should be able to download (pull) and upload (push) files with token or SSH authentication and not with a password. The exact implementation of the authentication is slightly different on Windows, Mac/BSD, and Linux/Unix systems.
If you are still being asked for a password, then you are out of luck. You can type in your Github password, but you will get a message that Github no longer accepts “pushing” files with password authentication. In this case you must troubleshoot why RStudio is not aware of your personal access token (PAT) or SSH key.
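When troubleshooting, it helps to know which credential the remote address of your local copy actually calls for. The following base R sketch classifies a single remote URL; the function name `expected_credential()` is hypothetical. You can see the remote address of a repository by running `git remote -v` in the RStudio Terminal.

```r
# Hypothetical helper: which kind of credential does a remote URL need?
# Expects one URL as a single character string.
expected_credential <- function(remote_url) {
  if (grepl("^https://", remote_url)) {
    "personal access token (PAT)"
  } else if (grepl("^(git@|ssh://)", remote_url)) {
    "SSH key"
  } else {
    "unknown"
  }
}

expected_credential("https://github.com/dataobservatory-eu/new-contributors")
# "personal access token (PAT)"
expected_credential("git@github.com:rOpenGov/retroharmonize.git")
# "SSH key"
```

If the answer does not match the credential that you set up, changing the remote address of the project is usually simpler than setting up a second credential.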