This repository has been archived by the owner on May 22, 2019. It is now read-only.
I have installed the module as suggested and run the command: srcml preprocrepos -m 50G,50G,50G -r siva --output ./test
where siva is the directory containing all the .siva files. Changing the memory parameters makes no difference.
My Spark is quite old (1.3); could that be the reason? Is it runnable with the latest PySpark?
/usr/local/lib64/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
INFO:spark:Starting preprocess_repos-424fe007-f0db-48b7-863b-5a5b90ce5f63 on local[*]
Ivy Default Cache set to: /home/b7066789/.ivy2/cache
The jars for the packages stored in: /home/b7066789/.ivy2/jars
:: loading settings :: url = jar:file:/home/b7066789/.local/lib/python3.6/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
tech.sourced#engine added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found tech.sourced#engine;0.6.4 in central
found io.netty#netty-all;4.1.17.Final in central
found org.eclipse.jgit#org.eclipse.jgit;4.9.0.201710071750-r in central
found com.jcraft#jsch;0.1.54 in central
found com.googlecode.javaewah#JavaEWAH;1.1.6 in central
found org.apache.httpcomponents#httpclient;4.3.6 in central
found org.apache.httpcomponents#httpcore;4.3.3 in central
found commons-logging#commons-logging;1.1.3 in central
found commons-codec#commons-codec;1.6 in central
found org.slf4j#slf4j-api;1.7.2 in central
found tech.sourced#siva-java;0.1.3 in central
found org.bblfsh#bblfsh-client;1.8.2 in central
found com.thesamet.scalapb#scalapb-runtime_2.11;0.7.1 in central
found com.thesamet.scalapb#lenses_2.11;0.7.0-test2 in central
found com.lihaoyi#fastparse_2.11;1.0.0 in central
found com.lihaoyi#fastparse-utils_2.11;1.0.0 in central
found com.lihaoyi#sourcecode_2.11;0.1.4 in central
found com.google.protobuf#protobuf-java;3.5.0 in central
found commons-io#commons-io;2.5 in central
found io.grpc#grpc-netty;1.10.0 in central
found io.grpc#grpc-core;1.10.0 in central
found io.grpc#grpc-context;1.10.0 in central
found com.google.code.gson#gson;2.7 in central
found com.google.guava#guava;19.0 in central
found com.google.errorprone#error_prone_annotations;2.1.2 in central
found com.google.code.findbugs#jsr305;3.0.0 in central
found io.opencensus#opencensus-api;0.11.0 in central
found io.opencensus#opencensus-contrib-grpc-metrics;0.11.0 in central
found io.netty#netty-codec-http2;4.1.17.Final in central
found io.netty#netty-codec-http;4.1.17.Final in central
found io.netty#netty-codec;4.1.17.Final in central
found io.netty#netty-transport;4.1.17.Final in central
found io.netty#netty-buffer;4.1.17.Final in central
found io.netty#netty-common;4.1.17.Final in central
found io.netty#netty-resolver;4.1.17.Final in central
found io.netty#netty-handler;4.1.17.Final in central
found io.netty#netty-handler-proxy;4.1.17.Final in central
found io.netty#netty-codec-socks;4.1.17.Final in central
found com.thesamet.scalapb#scalapb-runtime-grpc_2.11;0.7.1 in central
found io.grpc#grpc-stub;1.10.0 in central
found io.grpc#grpc-protobuf;1.10.0 in central
found com.google.protobuf#protobuf-java;3.5.1 in central
found com.google.protobuf#protobuf-java-util;3.5.1 in central
found com.google.api.grpc#proto-google-common-protos;1.0.0 in central
found io.grpc#grpc-protobuf-lite;1.10.0 in central
found org.rogach#scallop_2.11;3.0.3 in central
found org.apache.commons#commons-pool2;2.4.3 in central
found tech.sourced#enry-java;1.6.3 in central
found org.xerial#sqlite-jdbc;3.21.0 in central
found com.groupon.dse#spark-metrics;2.0.0 in central
found io.dropwizard.metrics#metrics-core;3.1.2 in central
:: resolution report :: resolve 1148ms :: artifacts dl 44ms
:: modules in use:
com.google.api.grpc#proto-google-common-protos;1.0.0 from central in [default]
com.google.code.findbugs#jsr305;3.0.0 from central in [default]
com.google.code.gson#gson;2.7 from central in [default]
com.google.errorprone#error_prone_annotations;2.1.2 from central in [default]
com.google.guava#guava;19.0 from central in [default]
com.google.protobuf#protobuf-java;3.5.1 from central in [default]
com.google.protobuf#protobuf-java-util;3.5.1 from central in [default]
com.googlecode.javaewah#JavaEWAH;1.1.6 from central in [default]
com.groupon.dse#spark-metrics;2.0.0 from central in [default]
com.jcraft#jsch;0.1.54 from central in [default]
com.lihaoyi#fastparse-utils_2.11;1.0.0 from central in [default]
com.lihaoyi#fastparse_2.11;1.0.0 from central in [default]
com.lihaoyi#sourcecode_2.11;0.1.4 from central in [default]
com.thesamet.scalapb#lenses_2.11;0.7.0-test2 from central in [default]
com.thesamet.scalapb#scalapb-runtime-grpc_2.11;0.7.1 from central in [default]
com.thesamet.scalapb#scalapb-runtime_2.11;0.7.1 from central in [default]
commons-codec#commons-codec;1.6 from central in [default]
commons-io#commons-io;2.5 from central in [default]
commons-logging#commons-logging;1.1.3 from central in [default]
io.dropwizard.metrics#metrics-core;3.1.2 from central in [default]
io.grpc#grpc-context;1.10.0 from central in [default]
io.grpc#grpc-core;1.10.0 from central in [default]
io.grpc#grpc-netty;1.10.0 from central in [default]
io.grpc#grpc-protobuf;1.10.0 from central in [default]
io.grpc#grpc-protobuf-lite;1.10.0 from central in [default]
io.grpc#grpc-stub;1.10.0 from central in [default]
io.netty#netty-all;4.1.17.Final from central in [default]
io.netty#netty-buffer;4.1.17.Final from central in [default]
io.netty#netty-codec;4.1.17.Final from central in [default]
io.netty#netty-codec-http;4.1.17.Final from central in [default]
io.netty#netty-codec-http2;4.1.17.Final from central in [default]
io.netty#netty-codec-socks;4.1.17.Final from central in [default]
io.netty#netty-common;4.1.17.Final from central in [default]
io.netty#netty-handler;4.1.17.Final from central in [default]
io.netty#netty-handler-proxy;4.1.17.Final from central in [default]
io.netty#netty-resolver;4.1.17.Final from central in [default]
io.netty#netty-transport;4.1.17.Final from central in [default]
io.opencensus#opencensus-api;0.11.0 from central in [default]
io.opencensus#opencensus-contrib-grpc-metrics;0.11.0 from central in [default]
org.apache.commons#commons-pool2;2.4.3 from central in [default]
org.apache.httpcomponents#httpclient;4.3.6 from central in [default]
org.apache.httpcomponents#httpcore;4.3.3 from central in [default]
org.bblfsh#bblfsh-client;1.8.2 from central in [default]
org.eclipse.jgit#org.eclipse.jgit;4.9.0.201710071750-r from central in [default]
org.rogach#scallop_2.11;3.0.3 from central in [default]
org.slf4j#slf4j-api;1.7.2 from central in [default]
org.xerial#sqlite-jdbc;3.21.0 from central in [default]
tech.sourced#engine;0.6.4 from central in [default]
tech.sourced#enry-java;1.6.3 from central in [default]
tech.sourced#siva-java;0.1.3 from central in [default]
:: evicted modules:
com.google.protobuf#protobuf-java;3.5.0 by [com.google.protobuf#protobuf-java;3.5.1] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 51 | 0 | 0 | 1 || 50 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 50 already retrieved (0kB/18ms)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/10/03 15:50:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/10/03 15:50:55 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
18/10/03 15:50:58 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
INFO:engine:Initializing engine on siva
INFO:ParquetSaver:Ignition -> DzhigurdaFiles -> UastExtractor -> Moder -> FieldsSelector -> ParquetSaver
Traceback (most recent call last):
File "/home/b7066789/.local/bin/srcml", line 11, in <module>
sys.exit(main())
File "/home/b7066789/.local/lib/python3.6/site-packages/sourced/ml/main.py", line 354, in main
return handler(args)
File "/home/b7066789/.local/lib/python3.6/site-packages/sourced/ml/utils/engine.py", line 87, in wrapped_pause
return func(cmdline_args, *args, **kwargs)
File "/home/b7066789/.local/lib/python3.6/site-packages/sourced/ml/cmd/preprocess_repos.py", line 24, in preprocess_repos
.link(ParquetSaver(save_loc=args.output))
File "/home/b7066789/.local/lib/python3.6/site-packages/sourced/ml/transformers/transformer.py", line 114, in execute
head = node(head)
File "/home/b7066789/.local/lib/python3.6/site-packages/sourced/ml/transformers/basic.py", line 292, in __call__
rdd.toDF().write.parquet(self.save_loc)
File "/home/b7066789/.local/lib/python3.6/site-packages/pyspark/sql/session.py", line 58, in toDF
return sparkSession.createDataFrame(self, schema, sampleRatio)
File "/home/b7066789/.local/lib/python3.6/site-packages/pyspark/sql/session.py", line 582, in createDataFrame
rdd, schema = self._createFromRDD(data.map(prepare), schema, samplingRatio)
File "/home/b7066789/.local/lib/python3.6/site-packages/pyspark/sql/session.py", line 380, in _createFromRDD
struct = self._inferSchema(rdd, samplingRatio)
File "/home/b7066789/.local/lib/python3.6/site-packages/pyspark/sql/session.py", line 351, in _inferSchema
first = rdd.first()
File "/home/b7066789/.local/lib/python3.6/site-packages/pyspark/rdd.py", line 1364, in first
raise ValueError("RDD is empty")
ValueError: RDD is empty
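For context on the traceback: toDF() infers the DataFrame schema from the first element of the RDD, so it calls rdd.first(), which raises ValueError when the RDD holds no rows. The error therefore means the upstream transformers (UAST extraction etc.) produced nothing from the siva directory, rather than a memory problem. A minimal pure-Python sketch of that failure mode (hypothetical infer_schema helper, not the actual pyspark implementation):

```python
def infer_schema(rows):
    """Mimic schema inference: peek at the first row to learn field types.

    Raises ValueError when the input is empty, mirroring rdd.first()
    in the traceback above.
    """
    it = iter(rows)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("RDD is empty")
    # Map each field name to the type name of its first observed value.
    return {key: type(value).__name__ for key, value in first.items()}
```

So an empty result at this point may indicate that no files survived extraction, e.g. the siva files yielded no parseable content for the pipeline stages listed in the ParquetSaver log line.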