lst

length-constrained maximum-sum subtree algorithms

Processing of Enron messages

  • "= \ " to ""
  • "\ " to " "
  • "=\r\n" to ""
  • "\r\n=" to ""
  • "[IMAGE]" to ""
  • "\r\n" to " "
  • "\n" to " " # collapse to one line
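The replacement rules above can be sketched as an ordered list of plain string substitutions. This is a minimal illustration, not the repository's actual code: the name `clean_body` is hypothetical, the patterns are taken literally as written in the list (including the first one, read here as equals sign, space, backslash, space), and the rules are assumed to be applied top to bottom.

```python
# Ordered (pattern, replacement) pairs, copied literally from the list above.
# Order matters: "=\r\n" must be handled before the generic "\r\n" rule.
REPLACEMENTS = [
    ("= \\ ", ""),    # assumption: literal "=", space, backslash, space
    ("\\ ", " "),     # backslash followed by a space
    ("=\r\n", ""),    # soft line break left over from quoted-printable
    ("\r\n=", ""),
    ("[IMAGE]", ""),  # image placeholder
    ("\r\n", " "),
    ("\n", " "),      # collapse to one line
]


def clean_body(text):
    """Apply every replacement in order and return the cleaned text."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(clean_body("soft=\r\nwrap [IMAGE]line\r\nnext\nend"))
```

Keeping the rules in a list rather than chaining calls makes the ordering explicit and easy to change.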

Definition of a word (and the resulting vocabulary size):

  1. only alphabetic letters: 13041 unique tokens
  2. otherwise: 21862 unique tokens
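The two vocabulary sizes can be compared with a small counting sketch. It rests on assumptions that the README does not state: tokens are obtained by lowercasing and splitting on whitespace, definition 1 keeps only tokens made entirely of letters, and "otherwise" is read as keeping every token. The function names are hypothetical.

```python
def tokens_alphabetic(text):
    """Definition 1 (assumed): whitespace tokens made only of letters."""
    return [t for t in text.lower().split() if t.isalpha()]


def tokens_any(text):
    """Definition 2 (assumed): every whitespace-separated token."""
    return text.lower().split()


def vocabulary_size(documents, tokenize):
    """Count the unique tokens across all documents."""
    vocabulary = set()
    for doc in documents:
        vocabulary.update(tokenize(doc))
    return len(vocabulary)


if __name__ == "__main__":
    docs = ["Enron ISO 2001", "enron e-mail iso"]
    print(vocabulary_size(docs, tokens_alphabetic))
    print(vocabulary_size(docs, tokens_any))
```

Under this reading the first vocabulary is always a subset of the second, which is consistent with the smaller count reported for definition 1.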

Topics

  1. davis utilities san plant plants times million utility blackouts generators commission customers trading companies percent electric officials federal wed edison. California electricity crisis
  2. ect iso enronxgate amto confidential report draft enroncc susan joe communications ken comments order david june transmission markets language chairman. Ken's email
  3. bush jones president dow stock bank companies trading dynegy confidential news service natural oil credit services copyright deal percent policies. Bush and Ken Lay: Slip Slidin' Away
  4. davis utilities edison billion federal generators utility commission governor plan million crisis san plants electric pay companies thursday iso southern. Davis buy transmission lines

Resources

Valid interaction.json

It should contain the following fields:

  • message_id
  • subject
  • body
  • timestamp
  • datetime (optional)
  • sender_id
  • recipient_ids

The last two fields (sender_id and recipient_ids) can be replaced by a single participant_ids field.
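The field requirements can be checked with a short validator. This is a sketch under assumptions: it looks at one record as a Python dict (how records are arranged inside interaction.json is not specified here), it checks only for the presence of fields, not their types, and `missing_fields` is a hypothetical name.

```python
import json

# Fields every record needs; datetime is optional and therefore not listed.
REQUIRED = {"message_id", "subject", "body", "timestamp"}
SENDER_RECIPIENTS = {"sender_id", "recipient_ids"}


def missing_fields(record):
    """Return a sorted list describing the fields absent from the record."""
    present = set(record)
    missing = set(REQUIRED - present)
    # sender_id + recipient_ids may be replaced by participant_ids.
    has_pair = SENDER_RECIPIENTS <= present
    if not has_pair and "participant_ids" not in present:
        missing.add("sender_id and recipient_ids (or participant_ids)")
    return sorted(missing)


if __name__ == "__main__":
    record = json.loads(
        '{"message_id": 1, "subject": "s", "body": "b", '
        '"timestamp": 989000000, "participant_ids": [1, 2]}'
    )
    print(missing_fields(record))
```

An empty list means the record satisfies the requirements listed above.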

gen_cand_trees.sh and check_events.sh

  • B_on_algos.sh: increasing budget vs. tree size for various algorithms