todo.org

Todo [10/07/2014]

  • [X] Learn the org mode short cuts
  • [X] Figure out BibTeX
  • [X] Get a LaTeX plugin working nicely in Sublime (LaTeXing isn’t working because you can’t buy it)
  • [X] Set up an organisation repo to put all the org mode files and notes
  • [ ] Specifics on mathematical induction, seems intuitive, must validate
  • [ ] When we refer to “order”, over and above the cyclic space “depth”, what else does it mean more generally? Same cardinality means there is a bijection between them. Be absolutely sure on bijection. Does it also mean “same number of elements of every order”?

Cardinality == the number of elements == the order (of the group, as distinct from the order of a single element)
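The open question above (does "same cardinality" also give "same number of elements of every order"?) can be checked on a small case. The sketch below is an illustration only: the helper names are made up for this note, and it only covers additive cyclic groups Z_n and products of two of them. It compares Z_4 with Z_2 x Z_2, which both have four elements.

```python
from collections import Counter
from itertools import product
from math import gcd


def order_in_zn(a, n):
    """Order of a in the additive group Z_n: the smallest k >= 1 with k*a = 0 (mod n)."""
    return n // gcd(a, n)


def lcm(a, b):
    return a * b // gcd(a, b)


def order_profile_cyclic(n):
    """Map each element order to how many elements of Z_n have that order."""
    return Counter(order_in_zn(a, n) for a in range(n))


def order_profile_product(m, n):
    """Same as above for Z_m x Z_n; the order of (a, b) is the lcm of the component orders."""
    return Counter(
        lcm(order_in_zn(a, m), order_in_zn(b, n))
        for a, b in product(range(m), range(n))
    )


z4 = order_profile_cyclic(4)            # orders of 0, 1, 2, 3 are 1, 4, 2, 4
klein = order_profile_product(2, 2)     # identity has order 1, the other three have order 2

# Same cardinality (so a bijection between the sets exists) ...
print(sum(z4.values()), sum(klein.values()))
# ... but different counts of elements per order, so the groups are not isomorphic.
print(z4 == klein)
```

So equal cardinality only guarantees a bijection of the underlying sets. Matching counts of elements of every order is a stronger condition, and it is necessary for the two groups to be isomorphic.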

queries in relational databases. In STOC, 1977

As a first piece of research, the idea is to benchmark Spark SQL against Impala, then implement an integration between a relational DB and Catalyst such that certain queries are optimised, and show the performance uplift. There will be two outcomes. First, some numbers relating to the number of concurrent users for a given cluster. Second, a comparison of the performance of certain queries before and after the externalized query index has been created.
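The before/after comparison could be measured with something as small as the sketch below. It is only a sketch under assumptions: `run_query` is a hypothetical zero-argument callable standing in for "submit one query to Impala or Spark SQL and wait for the result", and the median of a few repeats is one reasonable summary, not the method the plan commits to.

```python
import statistics
import time


def time_query(run_query, repeats=5):
    """Median wall-clock seconds over `repeats` calls of the hypothetical `run_query()`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(before_seconds, after_seconds):
    """How many times faster the query runs after the change (values > 1 mean an uplift)."""
    return before_seconds / after_seconds


# Stand-in workloads so the sketch runs without a cluster; real runs would call the engines.
before = time_query(lambda: sum(range(200_000)))
after = time_query(lambda: sum(range(50_000)))
print(f"before={before:.6f}s after={after:.6f}s")
```

Taking the median rather than the mean keeps a single slow run (for example a cold cache) from dominating the comparison.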

  • [X] Cluster creation process on AWS
  • [X] Local dev environment
  • [X] Get the data generated in the small
  • [ ] Write some scripts that load the source data into Parquet using Hive, not Impala.
  • [ ] Get the data generated in the large
  • [X] Get Impala tests working on the local VM
  • [ ] Experiment with the performance testing framework for Scala
  • [ ] Get equivalent Spark tests working on the local VM
  • [X] Write the performance scripts for Impala
  • [ ] Get the performance test working on the local VM

./cloudera-manager-installer.bin --i-agree-to-all-licenses --noprompt --noreadme --nooptions
sudo sysctl vm.swappiness=0

To copy from HDFS to S3:
hadoop distcp -Dfs.s3.awsAccessKeyId= -Dfs.s3.awsSecretAccessKey= hdfs:///user/hive/warehouse/tpcds_parquet.db/customer s3://tpcds/tpcds-cdh5/customer

7563551141

A functor is your structure: something you can map over. Free is a way of encoding an AST; a generic tree is a free monad.
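To make the note above concrete, here is a minimal sketch of "a generic tree is a free monad". It assumes the functor is fixed to the plain Python list (Python has no higher-kinded types, so the functor cannot be a parameter), and the names `Leaf`, `Node`, `bind`, `fmap_tree` and `leaves` are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Leaf:
    """The 'pure' case: a value with no further structure."""
    value: Any


@dataclass
class Node:
    """The 'free' case: one layer of the functor (here a list) wrapping subtrees."""
    children: list


def fmap_list(f, xs):
    """The functor part: map over one layer of structure."""
    return [f(x) for x in xs]


def bind(tree, k):
    """Monadic bind: replace every leaf value v with the tree k(v)."""
    if isinstance(tree, Leaf):
        return k(tree.value)
    return Node(fmap_list(lambda t: bind(t, k), tree.children))


def fmap_tree(f, tree):
    """Mapping over the whole tree falls out of bind."""
    return bind(tree, lambda v: Leaf(f(v)))


def leaves(tree):
    """Flatten to the leaf values, left to right."""
    if isinstance(tree, Leaf):
        return [tree.value]
    return [v for child in tree.children for v in leaves(child)]


t = Node([Leaf(1), Node([Leaf(2), Leaf(3)])])
grown = bind(t, lambda v: Node([Leaf(v), Leaf(v * 10)]))
print(leaves(fmap_tree(lambda v: v + 1, t)))
print(leaves(grown))
```

Read as an AST, `Node` is a syntax node whose shape is given by the functor and `Leaf` is a variable; `bind` is then substitution of a whole subtree for each variable.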

Learning order: 1-Algebra 2-Pure functional data structures 3-Logic and category 4-Fusion and optimisation