Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impl. file upload endpoint #470

Merged
merged 12 commits into from
Jul 30, 2024
Merged

Impl. file upload endpoint #470

merged 12 commits into from
Jul 30, 2024

Conversation

pvannierop
Copy link
Contributor

@pvannierop pvannierop commented Jun 3, 2024

This PR will add a new endpoint that allows users to upload files to a storage location. This upload functionality is meant for data that is too large to be submitted tot the Kafka queue. After successful upload the upload endpoint returns the path of the newly created file. The application can then submit a new message to the Kafka queue that mentions this path.

For now, upload is only implemented for S3 storage.

The upload endpoint is disabled by default. It can be activated with the radar.file-upload.enabled application property.

Remarks

  • This PR add a new minio dependency. This allows file storage on Minio and on AWS S3.

@pvannierop pvannierop self-assigned this Jun 3, 2024
@pvannierop pvannierop changed the base branch from master to dev June 3, 2024 13:56
@pvannierop pvannierop force-pushed the file-upload-endpoint branch 2 times, most recently from 4ffc5fb to 750b7f5 Compare June 3, 2024 14:03
@pvannierop pvannierop marked this pull request as draft June 3, 2024 14:08
@pvannierop pvannierop force-pushed the file-upload-endpoint branch 7 times, most recently from 8532eeb to af7902c Compare June 10, 2024 13:52
Copy link
Member

@yatharthranjan yatharthranjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for adding this. Please find some comments.

@PostMapping(
"/" + PathsUtil.PROJECT_PATH + "/" + PathsUtil.PROJECT_ID_CONSTANT +
"/" + PathsUtil.USER_PATH + "/" + PathsUtil.SUBJECT_ID_CONSTANT +
"/" + PathsUtil.TOPIC_PATH + "/" + PathsUtil.TOPIC_ID_CONSTANT +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add "files" to the path too like .../files/topics to denote it is for file uploads

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

"/" + PathsUtil.TOPIC_PATH + "/" + PathsUtil.TOPIC_ID_CONSTANT +
"/upload")
public ResponseEntity<?> subjectFileUpload(
@RequestParam("file") MultipartFile file,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not accept a filename parameter too, which by default can be the random name you generate below.

Copy link
Contributor Author

@pvannierop pvannierop Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a security risk. The uploading party better not determine the filename if not needed. See OWASP Unrestricted file upload. I decided to assign server-side randomized filename for simplicity so that I did not have to do filename sanitization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, good point. Can you then add the timestamp to the generated file name when like {timestamp}_{randomUUID}.{ext}? And can also go in a folder per day like 022024)

Copy link
Contributor Author

@pvannierop pvannierop Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this feature. A timestamp is always added in front of the random filename as per your suggestion. You can opt-in for collection per day via a new property storage.s3.path.collect-per-day.

The functionality is handled by a new StoragePath.Builder class. Here is the Javadoc for reference:

 * Represents path on Object Storage for uploaded files.
 * <p>
 * The storage path is constructed to include required arguments (projectId, subjectId, topicId and filename)
 * and optional arguments (prefix, collectPerDay, folderTimestampPattern, fileTimestampPattern). The path will follow
 * the format: prefix/projectId/subjectId/topicId/[day folder]/timestamp_filename.extension.
 * The day folder is included if collectPerDay is set to true. File extensions are converted to lowercase.
 * </p>
 *
 * <p>
 * Usage example:
 * </p>
 * <pre>
 * StoragePath path = StoragePath.builder()
 *     .prefix("uploads")
 *     .projectId("project1")
 *     .subjectId("subjectA")
 *     .topicId("topicX")
 *     .collectPerDay(true)
 *     .filename("example.txt")
 *     .build();
 *
 * System.out.println(path.getFullPath();
 * 'uploads/project1/subjectA/topicX/20220101/20220101_example.txt'
 *
 * System.out.println(path.getPathInTopicDir()
 * '20220101/20220101_example.txt'
 * </pre>


public String store(MultipartFile file, String projectId, String subjectId, String topicId) {
Assert.notNull(file, "File must not be null");
Assert.notEmpty(new String[]{projectId, subjectId, topicId}, "Project, subject and topic IDs must not be empty");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably better to use specific exceptions with HTTP codes and handle them with Exception handler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not quite understand what you mean here. In the Service layer it does ont make sense to me to throw HTTP-level exceptions. The Assert in the service throws an IllegalArgumentException which is fine I think. Or ar e you not proposing to send HTTP-level exceptions from the service?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will be relayed to the app as an HTTP response, so it makes sense to have a specific error message in case they are useful. You can create new exceptions like InvalidFileDetailsException extends IllegalArgumentException and annotate with @ResponseStatus(HttpStatus.XYZ). This can then be converted to a proper response for the app in Exception handler. Alternative is to catch the assertion error in the controller and raise these exceptions there, but I prefer doing it in the service (or where the exception is raised) as you can include more context. Essentially, you are not sending HTTP responses from the service but raising usual exceptions and configuring spring to convert to responses (this avoids handling them manually and keeps all the exception handling code neatly in one place).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I think I understand. Could you please review my last commit to see whether this is what you intended?

@Authorized(
permission = AuthPermissions.UPDATE,
entity = AuthEntities.SUBJECT,
permissionOn = PermissionOn.SUBJECT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to use CREATE MEASUREMENT permissions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops you are right! Updated...

Copy link
Member

@mpgxvii mpgxvii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. Would it be possible to fix the PMD errors as well?

@pvannierop pvannierop force-pushed the file-upload-endpoint branch 3 times, most recently from 75fb66e to b81fe7e Compare July 3, 2024 08:43
@pvannierop
Copy link
Contributor Author

pvannierop commented Jul 3, 2024

@mpgxvii @yatharthranjan I have fixed the PMD linting errors. It forced me to remove duplication of string constants to member variables. Normally a good thing, but for tests it makes the tests less easy to write and interpret. Can we not better suppress linting checks for test classes, or at least suppress the The String literal "project1" appears 9 times in this file; ... errors for tests?

And on a side note. I think that it would be an improvement to separate style checks and unit/integration tests in separate GitHub Action workflows.

@pvannierop pvannierop marked this pull request as ready for review July 3, 2024 09:08
@pvannierop pvannierop force-pushed the file-upload-endpoint branch 3 times, most recently from 9494b04 to b012df9 Compare July 3, 2024 09:54
Currently only implemented for S3 storage type.
@pvannierop
Copy link
Contributor Author

@mpgxvii @yatharthranjan I have added another commit that fixes a CORS issue. The Location header was not exposed to the aRMT app because of CORS protection. In the commit the CORS exception is added to the upload endpoint method. I tried to add this to the MultiHttpSecurityConfig class, but this did not work.

origins = "*",
allowedHeaders = "*",
exposedHeaders = "Location", // needed to get the URI of the uploaded file in aRMT
methods = { RequestMethod.POST }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the rest are already specified in the MultiHttpSecurityConfig, should we only use exposedHeaders here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we tried this and it dit not work. Leaving out origins caused the annotation to not work. It is almost as if the MultiHttpSecurityConfig does not apply to the endpoints. We also tried adding .exposedHeaders(...) to MultiHttpSecurityConfig but this also did not work. The annotation in its current form above the upload method was needed.

public String store(MultipartFile file, String projectId, String subjectId, String topicId) {
if (
file == null || projectId == null || subjectId == null || topicId == null
|| file.isEmpty()|| projectId.isEmpty() || subjectId.isEmpty() || topicId.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the check for blank values is being done in StoragePath, maybe we don't need the check for empty here anymore?

Copy link
Contributor Author

@pvannierop pvannierop Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that every component should enforce its own contract/API. But this is my opinion.

@pvannierop pvannierop requested a review from mpgxvii July 15, 2024 11:01
Copy link
Member

@mpgxvii mpgxvii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the updates!

Copy link
Member

@yatharthranjan yatharthranjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the substantial effort.

Just a minor comment on adding @ResponseStatus to the exceptions.

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class InvalidFileDetailsException extends IllegalArgumentException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you map these to HTTP codes using @ResponseStatus(HttpStatus.XYZ)? Perhaps Expectation failed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added! Can you check this please. I have never used this mechanism.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes looks good, thanks

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class InvalidPathDetailsException extends IllegalArgumentException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you map these to HTTP codes using @ResponseStatus(HttpStatus.XYZ)? Perhaps Expectation failed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added! Can you check this please. I have never used this mechanism.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the code being added, can you double check and add @ResponseStatus(HttpStatus.EXPECTATION_FAILED) here. Thanks.

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class FileStorageException extends RuntimeException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you map these to HTTP codes using @ResponseStatus(HttpStatus.XYZ)? Perhaps http code 500 in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added! Can you check this please. I have never used this mechanism.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes looks good, thanks

Copy link
Member

@yatharthranjan yatharthranjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, please find some comments below.

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ResponseStatus;

@ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
public class EmailMessageTransmitException extends MessageTransmitException {
Copy link
Member

@yatharthranjan yatharthranjan Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the MessageTransmitException (s) ever unhandled? You don't need this response status here if they are caught and handled. Moreover, these are thrown when a scheduled job is run, not in response to an HTTP request, hence, no need to add these here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ResponseStatus;

@ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class FileStorageException extends RuntimeException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes looks good, thanks

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class InvalidFileDetailsException extends IllegalArgumentException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes looks good, thanks

@@ -0,0 +1,13 @@
package org.radarbase.appserver.exception;

public class InvalidPathDetailsException extends IllegalArgumentException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the code being added, can you double check and add @ResponseStatus(HttpStatus.EXPECTATION_FAILED) here. Thanks.

Comment on lines 62 to 65
@ExceptionHandler(MessageTransmitException.class)
public final ResponseEntity<ErrorDetails> handleMessageTransmitException(Exception ex, WebRequest request) {
return handleEntityWithCause(ex, request);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be removed as don't need a response for these exceptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

Copy link
Member

@yatharthranjan yatharthranjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mpgxvii
Copy link
Member

mpgxvii commented Jul 23, 2024

@pvannierop Is this ok to merge?

Init at application start interfered with the deployment of RADAR-base where S3 services are installed after Appserver.
@mpgxvii mpgxvii merged commit 9e45eb7 into dev Jul 30, 2024
4 checks passed
@mpgxvii mpgxvii deleted the file-upload-endpoint branch July 30, 2024 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants