Project 3 Report

aksrajvanshi edited this page May 2, 2020 · 10 revisions

Problem Statement

Our problem statement focused on the state of MFT at the time, which supported only three protocols: SCP, Local, and S3. Moreover, MFT was not a standalone system: a service had to integrate with it, and a user could not talk to it directly. Furthermore, it was not deployed in a distributed environment, which limited its scalability and fault tolerance.

Differences from Initial Problem Statement

Initially, we aimed to make MFT a standalone system and to add support for new transport providers. We also looked into the design of the project and how it could be extended for deployment in a cloud environment. However, since multiple teams were focusing on the same problem, we decided to concentrate on adding new protocol support, testing it, and thereby finding issues in the quickly evolving MFT codebase.

Problem Statement Development

We regularly communicated with the Airavata developer community to resolve doubts about our contributions and to report the issues we found in the existing codebase. The following are the links to the communication between the pika-pika team and the developer community.

  1. Review of MFT design documentation

  2. Request for advice

  3. Apache Airavata MFT - AWS/GCS support

  4. MFT- Dropbox Transport Implementation

  5. S3 to SCP - Apache Airavata MFT

  6. Question regarding Resource and Secret service Backend in Production

  7. Apache Airavata MFT - Dashboard

  8. Airavata MFT project errors

Thanks to the prompt responses of the developer community, we were able to contribute successfully to the project.

Methodology

  • We started by understanding the codebase and testing the existing transport protocols. We achieved this by mailing the Apache dev community with our questions and then setting up the project in our local development environment.

  • We had a look at the previous contributions to the project to get an understanding of the changes required to integrate a new transport protocol.

  • We studied each storage provider's API to identify the parameters needed to transfer files to and from it. These parameters included the authentication token, the credentials file, and the bucket name.

  • Finally, we implemented the necessary changes and created a pull request for the moderators to review the code.
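
The parameters gathered in the third step fall into two groups: details that locate the data and details that authenticate the caller. The following is a minimal sketch of that split, written in Python for brevity. The class names, field names, and the `describe` helper are hypothetical illustrations and are not the MFT resource or secret schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageResource:
    """Where the data lives (hypothetical fields, not the MFT schema)."""
    resource_id: str
    bucket_name: str
    object_path: str


@dataclass(frozen=True)
class StorageSecret:
    """How to authenticate (hypothetical fields, not the MFT schema)."""
    secret_id: str
    access_token: str = ""
    credentials_file: str = ""

    def validate(self) -> None:
        # Assumption for this sketch: exactly one authentication method is given.
        if bool(self.access_token) == bool(self.credentials_file):
            raise ValueError("provide either access_token or credentials_file")


def describe(resource: StorageResource, secret: StorageSecret) -> str:
    """Summarise a transfer endpoint without printing the secret itself."""
    secret.validate()
    method = "token" if secret.access_token else "credentials file"
    return f"{resource.bucket_name}/{resource.object_path} via {method}"
```

Keeping the two records separate means a resource description can be shared or logged without exposing credentials.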

Implementation

  • We started by forking the original repository and setting up an upstream remote to stay up to date with the latest changes.

  • Then, we added an api-gateway service that interacts with the existing MFT system by making RPC calls to the MFT API service application.

  • We added new transport modules containing the Sender, Receiver, and MetadataCollector implementations for the corresponding storage providers, namely GCS and Dropbox.

  • We extended the resource and secret proto files based on the authentication method and the resource details required by each storage provider.

  • We then added the relevant methods to the resource and secret backend that were required to support our implementation.

  • Finally, we tested the integration by transferring files between our newly implemented transports and the existing ones.
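
To illustrate the three roles named above, here is a simplified sketch in Python (MFT itself is not written in Python). It assumes that a Receiver reads bytes from the source, a Sender writes them to the destination, and a MetadataCollector answers questions about stored objects. The method names, the `InMemoryTransport` class, and the `transfer` function are hypothetical and do not reproduce the MFT interfaces.

```python
from abc import ABC, abstractmethod
from typing import Dict


class MetadataCollector(ABC):
    """Answers questions about an object without moving its data."""

    @abstractmethod
    def exists(self, path: str) -> bool: ...

    @abstractmethod
    def size(self, path: str) -> int: ...


class Receiver(ABC):
    """Reads bytes out of the source storage (assumed role)."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class Sender(ABC):
    """Writes bytes into the destination storage (assumed role)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryTransport(MetadataCollector, Receiver, Sender):
    """Toy provider backed by a dict, standing in for GCS or Dropbox."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def exists(self, path: str) -> bool:
        return path in self._objects

    def size(self, path: str) -> int:
        return len(self._objects[path])

    def read(self, path: str) -> bytes:
        return self._objects[path]

    def write(self, path: str, data: bytes) -> None:
        self._objects[path] = data


def transfer(meta: MetadataCollector, receiver: Receiver, sender: Sender,
             src_path: str, dst_path: str) -> int:
    """Copy one object from source to destination; return bytes moved."""
    if not meta.exists(src_path):
        raise FileNotFoundError(src_path)
    data = receiver.read(src_path)
    sender.write(dst_path, data)
    return len(data)
```

Because `transfer` depends only on the three abstract roles, a new provider can be paired with any existing one without changes to the transfer logic, which is the property the integration tests exercise.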

We also contributed to the design document that elaborates further on how to integrate a new transport with the MFT project.

Design Document for MFT transport

Evaluation

We evaluated our implementation by transferring files in both directions between our newly integrated transports and the existing protocols. We rebased our forked repository on upstream before integrating any of our changes. This way, we integrated our code without introducing new issues into the existing codebase and carried out integration testing in the process.
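
One way to automate a to-and-fro check of this kind is to compare checksums after a round trip. The sketch below is an assumption about how such a check could look, not a record of the tests we ran: local file copies stand in for the two transfers, and the function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(payload: bytes) -> bool:
    """Send a payload out and back, then compare checksums."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "original.bin"
        remote = root / "remote.bin"      # stand-in for the new transport
        returned = root / "returned.bin"  # stand-in for an existing transport
        original.write_bytes(payload)
        shutil.copyfile(original, remote)   # "to": existing -> new
        shutil.copyfile(remote, returned)   # "fro": new -> existing
        return sha256_of(original) == sha256_of(returned)
```

In a real run, the two `shutil.copyfile` calls would be replaced by actual transfers through the transports under test.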

Conclusions and Outcomes

In conclusion, we successfully integrated two new transport protocols and identified some issues whose resolution improves the quality of the project codebase. As a positive outcome, we got the opportunity to work on a rapidly evolving open source project and to understand its design and purpose. On the other hand, although we explored orchestration platforms and service meshes, we were not able to carry out phase 3 of the proposed project plan, which was to deploy and integrate the system in a distributed environment.

Team Member Contributions

As a team, we contributed evenly across the two pull requests. The contributions made by the team are linked below:

  1. https://github.com/apache/airavata-mft/pull/6

  2. https://github.com/apache/airavata-mft/pull/9