ED-2252: Add OCI oss support in Build/DataPipeline/EdDataProducts #569

Open
wants to merge 38 commits into
base: release-5.1.1
38 commits
80531c9
updated cloud-store-sdk 1.4.6
ddevadat Jan 24, 2023
c1f2363
update the storageservice parameter list
ddevadat Jan 31, 2023
f7cdf62
added commons-pool2 to poms.xml
ddevadat Apr 13, 2023
0316443
updated version for common-pool2
ddevadat Apr 13, 2023
faf2980
Updated cassandra connector version
subhashchandrab Apr 13, 2023
23eb0db
Reverting jar version changes
subhashchandrab Apr 13, 2023
981679a
Updated the jar versions
subhashchandrab Apr 13, 2023
2d99b8f
added spark-cassandra-connector-assembly
ddevadat Apr 14, 2023
35e0b49
updated spark version
subhashchandrab Apr 14, 2023
47843b7
added debug line
ddevadat Apr 15, 2023
fce0494
debug lines added
ddevadat Apr 15, 2023
c00f285
added debug
ddevadat Apr 16, 2023
eebb7d5
changed debug message
ddevadat Apr 16, 2023
259bc56
expicitly setting s3 hadoop configuration
ddevadat Apr 16, 2023
d88e33c
reverted s3 spark cconfiguration
ddevadat Apr 16, 2023
2c8ffb8
added oci specific hadoop conf
ddevadat Apr 16, 2023
32e00df
updated hadoop conf for oci
ddevadat Apr 16, 2023
615694e
added loggers
ddevadat Apr 16, 2023
099d5a0
updated etb metrics
ddevadat Apr 17, 2023
7461fda
updated spark hd configuration
ddevadat Apr 18, 2023
a9f549a
updated the spark hdp configuration
ddevadat Apr 18, 2023
ac08a4e
updated jets3t
ddevadat Apr 18, 2023
e8ba034
changed the table name
ddevadat Apr 19, 2023
79dad41
added debug
ddevadat Apr 26, 2023
b56476c
reverted reporting cass table name
ddevadat Apr 28, 2023
e921000
Added debug information
subhashchandrab May 2, 2023
3803ee5
Update FunnelReport.scala
heungheung May 22, 2023
6bc541c
Update FunnelReport.scala
heungheung May 22, 2023
281672f
Update FunnelReport.scala
heungheung May 22, 2023
a5974a0
Update FunnelReport.scala
heungheung May 22, 2023
dfad7ae
Update FunnelReport.scala - update csp bucket
heungheung May 22, 2023
b56af06
Update FunnelReport.scala - remove debug
heungheung May 25, 2023
af2e74c
Merge pull request #1 from ocisunbird/oci-5.1.0-du550
ddevadat May 25, 2023
f940bce
removed temp repo
ddevadat Aug 3, 2023
b3ac01c
combined two identical switch statements
ddevadat Aug 3, 2023
21f65bb
combined switch case
ddevadat Aug 3, 2023
bbb1f8a
removed unwanted logs
ddevadat Aug 3, 2023
063abde
removed debug prints
ddevadat Aug 3, 2023
6 changes: 3 additions & 3 deletions adhoc-scripts/pom.xml
@@ -43,7 +43,7 @@
<exclusions>
<exclusion>
<artifactId>jets3t</artifactId>
- <groupId>net.java.dev.jets3t</groupId>
+ <groupId>org.jets3t</groupId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
@@ -81,9 +81,9 @@
<scope>provided</scope>
</dependency>
<dependency>
- <groupId>net.java.dev.jets3t</groupId>
+ <groupId>org.jets3t</groupId>
<artifactId>jets3t</artifactId>
- <version>0.9.4</version>
+ <version>0.9.7</version>
<scope>provided</scope>
</dependency>
<dependency>
27 changes: 17 additions & 10 deletions data-products/pom.xml
@@ -12,8 +12,8 @@
<scoverage.plugin.version>1.4.11</scoverage.plugin.version>
<scala.maj.version>2.12</scala.maj.version>
<scala.version>2.12.10</scala.version>
- <spark.maj.version>3.0</spark.maj.version>
- <spark.version>3.1.0</spark.version>
+ <spark.maj.version>3.2</spark.maj.version>
+ <spark.version>3.2.1</spark.version>
</properties>


@@ -27,7 +27,8 @@
<id>jcenter-repo</id>
<name>Jcenter Repo</name>
<url>https://jcenter.bintray.com/</url>
- </repository>
+ </repository>
+
</repositories>

<dependencies>
@@ -90,7 +91,7 @@
<exclusions>
<exclusion>
<artifactId>jets3t</artifactId>
- <groupId>net.java.dev.jets3t</groupId>
+ <groupId>org.jets3t</groupId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
@@ -135,9 +136,9 @@
<scope>provided</scope>
</dependency>
<dependency>
- <groupId>net.java.dev.jets3t</groupId>
+ <groupId>org.jets3t</groupId>
<artifactId>jets3t</artifactId>
- <version>0.9.4</version>
+ <version>0.9.7</version>
<scope>provided</scope>
</dependency>
<dependency>
@@ -172,8 +173,8 @@
</dependency> -->
<dependency>
<groupId>com.datastax.spark</groupId>
- <artifactId>spark-cassandra-connector_${scala.maj.version}</artifactId>
- <version>3.1.0</version>
+ <artifactId>spark-cassandra-connector-assembly_${scala.maj.version}</artifactId>
+ <version>3.2.0</version>
<!-- <exclusions>-->
<!-- <exclusion>-->
<!-- <groupId>com.datastax.oss</groupId>-->
@@ -281,7 +282,7 @@
<dependency>
<groupId>org.sunbird</groupId>
<artifactId>cloud-store-sdk_2.12</artifactId>
- <version>1.3.0</version>
+ <version>1.4.6</version>
<exclusions>
<exclusion>
<groupId>com.microsoft.azure</groupId>
@@ -345,7 +346,13 @@
<version>0.7.1</version>
<scope>test</scope>
</dependency>
- </dependencies>
+
+ <dependency>
+ <groupId>org.apache.commons</groupId>
+ <artifactId>commons-pool2</artifactId>
+ <version>2.0</version>
+ </dependency>
+ </dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
@@ -50,8 +50,9 @@ trait BaseReportsJob {
val store = modelParams.getOrElse("store", "local").asInstanceOf[String];
val storageKey = modelParams.getOrElse("storageKeyConfig", "reports_storage_key").asInstanceOf[String];
val storageSecret = modelParams.getOrElse("storageSecretConfig", "reports_storage_secret").asInstanceOf[String];
+ val storageEndpoint = modelParams.getOrElse("storageEndpoint", "reports_storage_endpoint").asInstanceOf[String];
store.toLowerCase() match {
- case "s3" =>
+ case "s3" | "oci" =>
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", AppConf.getConfig(storageKey));
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", AppConf.getConfig(storageSecret));
case "azure" =>
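Reviewer note: the hunk above sends "oci" through the existing "s3" branch, presumably because OCI Object Storage is being reached through its S3-compatible API. The new `storageEndpoint` value is read but not used in the lines shown. The branch logic can be sketched as a pure function, which makes it checkable without a Spark session. `StorageConfSketch` and `hadoopSettings` are hypothetical names introduced for illustration; only the two `fs.s3n.*` keys come from the diff.

```scala
// Hypothetical helper, not part of the PR: returns the Hadoop settings
// that the "s3" | "oci" branch in the hunk above would apply.
object StorageConfSketch {
  def hadoopSettings(store: String, key: String, secret: String): Map[String, String] =
    store.toLowerCase match {
      // A pattern alternative lets both providers share one branch.
      case "s3" | "oci" =>
        Map(
          "fs.s3n.awsAccessKeyId"     -> key,
          "fs.s3n.awsSecretAccessKey" -> secret
        )
      // Other providers (azure, local, ...) are out of scope for this sketch.
      case _ => Map.empty
    }
}
```

In a job, each returned pair would be passed to `spark.sparkContext.hadoopConfiguration.set(k, v)`, as the diff does inline.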
@@ -178,7 +178,7 @@ trait OnDemandExhaustJob {
fc.getHadoopFileUtil().delete(conf, tempDir);
val filePrefix = storageConfig.store.toLowerCase() match {
// $COVERAGE-OFF$ Disabling scoverage
- case "s3" =>
+ case "s3" | "oci" =>
CommonUtil.getS3File(storageConfig.container, "");
case "azure" =>
CommonUtil.getAzureFile(storageConfig.container, "", storageConfig.accountKey.getOrElse("azure_storage_key"))
@@ -45,7 +45,6 @@ object StateAdminGeoReportJob extends IJob with StateAdminReportHelper {
val container = AppConf.getConfig("cloud.container.reports")
val objectKey = AppConf.getConfig("admin.metrics.cloud.objectKey")
val storageConfig = getStorageConfig(container, objectKey)
-
val organisationDF: DataFrame = loadOrganisationSlugDF()
val subOrgDF: DataFrame = generateSubOrgData(organisationDF)
val blockData:DataFrame = generateBlockLevelData(subOrgDF)
@@ -227,6 +227,10 @@ object ETBMetricsModel extends IBatchModelTemplate[Empty,Empty,FinalOutput,Final
val key = AppConf.getConfig("azure_storage_key")
val file = conf("file")
s"wasb://$bucket@$key.blob.core.windows.net/$file"
+ case "oci" =>
+ val bucket = conf("bucket")
+ val file = conf("file")
+ s"s3n://$bucket/$file"
}

val scansCount = sparkSession.read
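Reviewer note: the added "oci" case builds an `s3n://` URL from the bucket and file name. A minimal sketch of the two branches visible in this hunk follows, assuming the caller supplies the account string used in the Azure URL. `FileUrlSketch` and `fileUrl` are hypothetical names, and branches not shown in the hunk are deliberately left out rather than guessed.

```scala
// Hypothetical helper, not part of the PR: mirrors the "azure" and "oci"
// URL formats visible in the hunk above.
object FileUrlSketch {
  def fileUrl(store: String, bucket: String, file: String, azureAccount: String): Option[String] =
    store.toLowerCase match {
      case "azure" => Some(s"wasb://$bucket@$azureAccount.blob.core.windows.net/$file")
      case "oci"   => Some(s"s3n://$bucket/$file")
      // Returning None avoids a MatchError for stores this sketch does not cover.
      case _       => None
    }
}
```

Returning an `Option` is a design choice of the sketch; the code in the diff returns the string directly from the `match`.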
@@ -122,7 +122,10 @@ object FunnelReport extends IJob with BaseReportsJob {
.drop("channel","id","program_id")
.persist(StorageLevel.MEMORY_ONLY)

- val storageConfig = getStorageConfig("reports", "")
+ // support to other CSP and report container should be read from configuration
+ val modelParams = config.modelParams.getOrElse(Map[String, Option[AnyRef]]());
+ val container = modelParams.getOrElse("storageContainer", "reports").asInstanceOf[String]
+ val storageConfig = getStorageConfig(container, "")
saveReportToBlob(funnelReport, configMap, storageConfig, "FunnelReport")

funnelReport.unpersist(true)
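Reviewer note: the change above replaces the hard-coded "reports" container with a `storageContainer` model parameter that defaults to "reports". The lookup can be sketched in isolation, assuming `modelParams` behaves like a `Map[String, AnyRef]`. `ContainerSketch` and `reportContainer` are hypothetical names. Unlike the `asInstanceOf[String]` cast in the diff, this sketch falls back to the default when the value is not a non-empty string, which is a deliberate deviation.

```scala
// Hypothetical helper, not part of the PR: resolves the report container
// from job parameters, defaulting to "reports" as the diff does.
object ContainerSketch {
  val DefaultContainer = "reports"

  def reportContainer(modelParams: Map[String, AnyRef]): String =
    modelParams.get("storageContainer") match {
      // Accept only a non-empty String; anything else uses the default.
      case Some(s: String) if s.nonEmpty => s
      case _                             => DefaultContainer
    }
}
```

For example, `ContainerSketch.reportContainer(Map("storageContainer" -> "oci-reports"))` resolves to the configured container, while an empty map resolves to the default.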
2 changes: 1 addition & 1 deletion etl-jobs/pom.xml
@@ -24,7 +24,7 @@
<exclusions>
<exclusion>
<artifactId>jets3t</artifactId>
- <groupId>net.java.dev.jets3t</groupId>
+ <groupId>org.jets3t</groupId>
</exclusion>
<exclusion>
<groupId>org.apache.xbean</groupId>
@@ -46,7 +46,7 @@ object ESCloudUploader {
.saveAsTextFile(outputFilePath)

// backup the output file to cloud
- val storageService = StorageServiceFactory.getStorageService(StorageConfig(config.getString("cloudStorage.provider"), config.getString("cloudStorage.accountName"), config.getString("cloudStorage.accountKey")))
+ val storageService = StorageServiceFactory.getStorageService(StorageConfig(config.getString("cloudStorage.provider"), config.getString("cloudStorage.accountName"), config.getString("cloudStorage.accountKey"),Option(config.getString("cloudStorage.accountEndpoint")),Option("")))
storageService.upload(config.getString("cloudStorage.container"), outputFilePath + "/part-00000", config.getString("cloudStorage.objectKey"), isDirectory = Option(false))
println("successfully backed up file to cloud!")
System.exit(0)