Merge pull request #28 from davisjam/globus
globus readme
NicholasSynovic authored Jan 27, 2023
2 parents 2171cba + 4512b57 commit b9361d4
Showing 14 changed files with 66 additions and 2 deletions.
7 changes: 5 additions & 2 deletions README.md
@@ -164,11 +164,14 @@ scripts can be found in each model hub's script directory's `README.md` file.

## Pre-Packaged Dataset

An existing dataset is availible on
An existing dataset is available on
[this Purdue University Globus share](https://app.globus.org/file-manager?origin_id=55e17a6e-9d8f-11ed-a2a2-8383522b48d9&origin_path=%2F%7E%2F).

> NOTE: The [Hugging Face](https://huggingface.co) dataset is partially complete
> as of 1/26/2023. The full dataset is estimated to be availible by 1/31/2023.
> as of 1/26/2023. It is currently transferring from scratch space to the final
> destination. The transfer should complete by 1/31/2023.
If you are unfamiliar with Globus, we prepared a guide in the [globus-docs/](globus-docs/) directory.

## Example Usage of Dataset

61 changes: 61 additions & 0 deletions globus-docs/README.md
@@ -0,0 +1,61 @@
# Introduction

The PTMTorrent dataset is available for download from Purdue University via a Globus share.

[Globus](https://www.globus.org) is a data management and transfer service designed for working with large-scale data.

If you have not used Globus before, this document describes the setup steps necessary to get access to the data.
After finishing this guide, you will have downloaded the smallest piece of PTMTorrent onto your computer (less than 1GB of data).
You can follow similar steps to get the rest.

If your institution has large compute resources, the IT staff may already have a large storage system with a Globus front-end, and you can download PTMTorrent directly onto that storage system.

# Steps

1. Visit the PTMTorrent URL from the paper, and you'll get to a Globus login page.

![Globus login page](images/1-globus-landing.jpg)

2. Sign in through your organization, Google, or ORCID. I used ORCID in this example.

![ORCID sign-in](images/2-globus-orcid.jpg)

3. Successful login through ORCID.

![ORCID success](images/3-globus-loggedin.jpg)

4. Globus asks for some more permissions.

![Globus permissions request](images/4-globus-orcidperms.jpg)

5. We have reached the PTMTorrent share within Purdue's Globus service.

![PTMTorrent share](images/5-ptmtorrentLanding.jpg)

6. *(Side note.)* If your institution has a Globus instance, you might be able to right-click on the item of interest, get the link, and access it from your destination Globus. This link cannot be used via `wget` or similar tools.

![One download approach](images/6-ptmtorrent-download1.jpg)

7. Let's download something onto our workstation. You will need to install the Globus client on your machine. Visit https://www.globus.org/globus-connect-personal and follow the instructions.

8. Now we can see two shares in the Globus view: "My Laptop" (your workstation) and "PTMTorrent" (Purdue's Globus share) side by side in two panels. On your local share, pick the folder you want the data to land in. Then, on the PTMTorrent side, select the data you want and press the "Start" button. That button has a left-arrow pointing towards the destination share.

![Local share preparing to transfer from Purdue](images/7-ptmtorrent-DownloadViaClient-1.jpg)

9. In the "Activity" tab we can see that the task has queued.

![Task queued in the Activity tab](images/8-ptmtorrent-downloadViaClient-TaskQueued.jpg)

10. An email success notification.

![Email success notification](images/9-ptmtorrent-downloadViaClient-success.jpg)

11. I went to my Downloads/ folder and ran `tar -xzvf modelhub.tar.gz` and then `cd data/modelhub`. Let's see what is in the `repos` folder:

![The repos folder lists all models from this hub, in directories corresponding to their owner](images/10-downloadSuccess-allModels.jpg)

12. Within the `data/modelhub/repos/modelhub-ai/yolo-v3` PTM package, we see a git repository with 8 commits.

![YOLO-v3](images/11-downloadsuccess-yolov3Commits.jpg)

13. Happy mining!
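
Steps 11 and 12 can also be scripted. The following is a minimal Python sketch, not part of the PTMTorrent tooling: the function names are hypothetical, and it assumes the archive unpacks to the `data/modelhub/repos/<owner>/<name>` layout shown above and that `git` is installed for the commit count.

```python
import subprocess
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.gz archive into dest (similar to `tar -xzf`)."""
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can reject unsafe archive members.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)


def list_packages(repos_dir: Path) -> list[str]:
    """Return 'owner/name' for each package directory under repos_dir.

    Assumes the repos/<owner>/<name> layout described in step 11.
    """
    return sorted(
        f"{owner.name}/{repo.name}"
        for owner in repos_dir.iterdir() if owner.is_dir()
        for repo in owner.iterdir() if repo.is_dir()
    )


def count_commits(repo_dir: Path) -> int:
    """Count commits reachable from HEAD; requires git on PATH."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "rev-list", "--count", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

For example, after `extract_archive(Path("modelhub.tar.gz"), Path("."))`, calling `list_packages(Path("data/modelhub/repos"))` lists the downloaded packages, and `count_commits(Path("data/modelhub/repos/modelhub-ai/yolo-v3"))` reports the number of commits in that package's history.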
Binary file added globus-docs/images/1-globus-landing.jpg
Binary file added globus-docs/images/2-globus-orcid.jpg
Binary file added globus-docs/images/3-globus-loggedin.jpg
Binary file added globus-docs/images/4-globus-orcidperms.jpg
Binary file added globus-docs/images/5-ptmtorrentLanding.jpg
Binary file added globus-docs/images/6-ptmtorrent-download1.jpg
