πFlow V1.9 Release

@tianyao-0315 released this 08 Oct 06:58
· 13 commits to master since this release

Features

  1. Add new visualization features;
  2. Add a new Python base image management module;
  3. Add vector database storage components such as Chroma, Faiss, Weaviate, Pinecone, and Qdrant.

Requirements

  • JDK 1.8
  • Scala 2.12.18
  • Spark 3.4.0 (for other Spark versions, piflow.jar must be built from source)
  • Hadoop 3.3.0 (for other Hadoop versions, piflow.jar must be built from source)
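
As a quick way to compare an environment against the versions listed above, the following is a minimal sketch, not part of πFlow. The helper names `parse_version` and `matches`, and the `REQUIRED` table, are hypothetical; the version strings you feed in would come from whatever your tools print (for example the output of `java -version`).

```python
import re
from typing import Optional, Tuple

# Versions taken from the Requirements list of this release.
# The dictionary keys are illustrative labels, not πFlow identifiers.
REQUIRED = {
    "jdk": (1, 8),
    "scala": (2, 12, 18),
    "spark": (3, 4, 0),
    "hadoop": (3, 3, 0),
}


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Return the first dotted version number found in text, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def matches(required: Tuple[int, ...], found: Optional[Tuple[int, ...]]) -> bool:
    """True if the found version starts with every component of required."""
    return found is not None and found[: len(required)] == required


if __name__ == "__main__":
    # Example strings; replace them with the output of your own tools.
    samples = {
        "jdk": 'java version "1.8.0_292"',
        "scala": "Scala 2.12.18",
        "spark": "Spark-3.4.0",
        "hadoop": "Hadoop-3.3.0",
    }
    for name, text in samples.items():
        ok = matches(REQUIRED[name], parse_version(text))
        print(f"{name}: {'ok' if ok else 'mismatch'}")
```

A mismatch for Spark or Hadoop does not necessarily mean πFlow cannot run; per the note above, it means piflow.jar would need to be built from source for that version.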

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://master:9000
  
  #yarn resourcemanager.hostname
  yarn.resourcemanager.hostname=master
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://master:9083
  
  #show data in log, set 0 if you do not want to show data in logs
  data.show=5
  
  #server ip and port, ip can not be set to localhost or 127.0.0.1
  server.ip=your_ip
  server.port=8002
  
  #h2db port, path
  h2.port=50002
  #h2.path=test
  
  monitor.throughput=false
  #If you want to upload Python stops, set the HDFS configs below
  #example: hdfs.cluster=hostname:hostIP
  hdfs.cluster=master:127.0.0.1
  hdfs.web.url=master:9870
  checkpoint.path=/piflow/tmp/checkpoint/
  
  #unstructured.parse
  unstructured.parse=false
  #host cannot be set to localhost or 127.0.0.1
  #if the port is not set, it defaults to 8000
  #unstructured.port=8000
  #embed models path
  #embed_models_path=/data/testingStuff/models/
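
The constraints stated in the comments above (a non-loopback `server.ip`, the `hostname:hostIP` shape of `hdfs.cluster`, and the default of 8000 for `unstructured.port`) can be checked before starting the server. The following is a minimal sketch under stated assumptions: the functions `parse_properties`, `check_config`, and `unstructured_port` are hypothetical helpers and not part of πFlow, and treating the `your_ip` placeholder as invalid is an assumption of this example.

```python
from typing import Dict, List

# Values the config comments say server.ip must not use.
LOOPBACK = {"localhost", "127.0.0.1"}


def parse_properties(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_config(props: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems: List[str] = []

    # Assumption: the unedited 'your_ip' placeholder is also rejected.
    ip = props.get("server.ip", "")
    if not ip or ip in LOOPBACK or ip == "your_ip":
        problems.append("server.ip must be a real, non-loopback address")

    for key in ("server.port", "h2.port"):
        if not props.get(key, "").isdigit():
            problems.append(f"{key} must be a numeric port")

    # hdfs.cluster follows the documented hostname:hostIP example.
    cluster = props.get("hdfs.cluster")
    if cluster is not None:
        host, sep, host_ip = cluster.partition(":")
        if not (host and sep and host_ip):
            problems.append("hdfs.cluster must look like hostname:hostIP")

    return problems


def unstructured_port(props: Dict[str, str]) -> int:
    """Return unstructured.port, falling back to the documented default."""
    return int(props.get("unstructured.port", "8000"))


if __name__ == "__main__":
    with open("config.properties", encoding="utf-8") as handle:
        for problem in check_config(parse_properties(handle.read())):
            print("config problem:", problem)
```

Run against the sample file as published, this would report only `server.ip`, since `your_ip` still has to be replaced with the machine's address.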

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh