-*- mode: org -*- beancount: TODO

V3 / C++

  • TODO(blais): Set up ASSERT() as a stream.
  • Replace absl/strings/string_view.h with <string_view>
  • Contemplate removing Ledger and the beancount::Ledger proto. Do we really need both? Measure performance and use cases and eliminate one or the other.
  • Establish a distinction between strings and currencies in data type in Python; this allows us to process metadata correctly. See this plugin for a use case: beancount.plugins.check_commodity
  • Review and test pointer ownership and leakage. This parser does not currently apply much rigor on reclaiming memory.
  • In options `plugin_processing_mode`:

    TODO(blais): Make the default include all the core plugins, as PEDANTIC. Let that be the default. Make DEFAULT some other name. V3 should include all the pedantry by default.

For API

  • Inventory: Add a function to discard by currency.
  • Inventory: Make it easier to reduce & convert to a single currency.

Priorities

  • Fix #145

booking_full.py: # FIXME: Refactor compute_cost_number() and convert_costspec_to_cost().

{booking} {holding} {balance-where} {autogroup} {review-dcontext} {write-intro-doc} {commodity-required}

Returns (Revision)

  • Fix remaining tests for removal of open/balance entries from segment_periods().
  • Implement per account, with data structures + code to combine them together
  • Trailing windows
  • Rolling windows
  • Calculate pre-tax returns, post-tax returns
  • Output nice debugging information
  • Warn when prices aren’t within a certain date interval of their usage
  • Render a graph of the actual values of the assets as well.
  • There should be a separate web application that handles this, no need to pre-render everything.
  • You need to merge the external entries when multiple ones occur on the same date.
  • Handle the various FIXMEs.

Settlement Dates (Split & Merge)

In http://furius.ca/beancount/doc/proposal-settlement, I describe the problem of merging two half-transactions that occur at different dates, pulled by importing from two different accounts into a single one. I also discuss that this problem is dual to that of providing a way to input a single transaction with legs posting at different dates. These two problems are related and will be dealt with together by providing additions to the syntax and some plugins as well.

  • Include the email thread with mharris & redstreet0 in the settlement doc.
  • Review all the documentation to convert “transfer account” to “clearing account” everywhere; that’s the correct name for this idea (thanks Filippo).
  • Limbo accounts: Start a document on handling limbo accounts, summarize all info from emails. (BTW, these are called “Clearing Accounts”; rename all docs)
  • You need to be able to support a single transaction that gets amortized over time.

Dated Postings

  • In order to create multiple similar transactions at many dates from a single entry, you could allow overriding the date on each posting, e.g.:

    2013-04-01 * Blah di bla
      2013-01-01 Grocery 10 USD
      2013-02-01 Grocery 10 USD
      2013-03-01 Grocery 10 USD
      Cash

    This would create three transactions, with these dates:

    date        aux-date
    2013-01-01  2013-04-01  10 / 3.33
    2013-02-01  2013-04-01  …
    2013-03-01  2013-04-01  …

    Could be a nice way to make distributed transactions.

  • Move ‘effective date’ to the postings in my input file, using the dated postings feature.
  • Another idea would be to make @pad able to pad for a percentage of the total, so that we’re able to use @pad instead of “distribution of expenses” entries.
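The dated-postings expansion described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Posting` and `Transaction` classes, the per-posting `date` field and the `split_dated_postings()` function are hypothetical stand-ins, not Beancount's data model, and undated legs are assumed to have elided amounts that a later interpolation step would fill in.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass
class Posting:
    account: str
    number: Optional[Decimal] = None      # None: amount elided, to be interpolated
    currency: Optional[str] = None
    date: Optional[datetime.date] = None  # hypothetical per-posting date override


@dataclass
class Transaction:
    date: datetime.date
    narration: str
    postings: List[Posting]
    aux_date: Optional[datetime.date] = None


def split_dated_postings(txn: Transaction) -> List[Transaction]:
    """Expand one entry with dated postings into one transaction per dated posting."""
    dated = [p for p in txn.postings if p.date is not None]
    undated = [p for p in txn.postings if p.date is None]
    if not dated:
        return [txn]
    if any(p.number is not None for p in undated):
        # Assumption of this sketch: only elided legs can be shared across dates.
        raise ValueError("undated postings must have elided amounts")
    result = []
    for posting in dated:
        legs = [Posting(posting.account, posting.number, posting.currency)]
        legs.extend(Posting(p.account) for p in undated)
        # The original entry date is kept as the auxiliary date.
        result.append(
            Transaction(posting.date, txn.narration, legs, aux_date=txn.date))
    return result
```

Applied to the grocery example, this would yield three transactions dated 2013-01-01, 2013-02-01 and 2013-03-01, each carrying 2013-04-01 as its auxiliary date.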

Exporting Data

Beancount has various reports and export mechanisms, but there remain several flaws. I want to address some of these urgently.

  • Implement output to XLS as well.
  • Complete the export of holdings to a Google Spreadsheet and completely replace my usage of Google Finance for intra-day reporting. Part of this work should result in the ability to produce a spreadsheet of various aggregations of dimensions on each stock, e.g., report intra-day movement in small, mid and large cap stock positions separately.
  • Idea: Export the Posting table to a Spreadsheet equivalent, as a single list of postings, joining transactions by an id column. That should quiet down those who think you can do all this with a Spreadsheet, by letting them try it out for themselves.
  • Write a script that, given my portfolio of ETFs, will compute the total exposure of AMZN (or any other stock contained within the ETF) of the entire portfolio value. In other words, sum up the components of individual stocks across all ETFs and mutual funds, and reaggregate by those individual stocks. I believe that there is much overlap.
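The look-through exposure script described above might start from something like this sketch. The function name, the shape of the two input mappings and the fund tickers in the usage note are assumptions made for illustration; where the constituent weights come from (fund provider files, for instance) is left open.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict


def lookthrough_exposure(
    holdings: Dict[str, Decimal],
    constituents: Dict[str, Dict[str, Decimal]],
) -> Dict[str, Decimal]:
    """Aggregate market value by underlying stock across all funds.

    holdings: market value per held instrument (fund or single stock).
    constituents: for each fund, the weight of each underlying stock as a
        fraction of the fund. Instruments absent from this mapping are
        treated as single stocks.
    """
    exposure: Dict[str, Decimal] = defaultdict(Decimal)
    for instrument, value in holdings.items():
        weights = constituents.get(instrument)
        if weights is None:
            # A directly held stock counts in full.
            exposure[instrument] += value
            continue
        for stock, weight in weights.items():
            exposure[stock] += value * weight
    return dict(exposure)
```

For example, with 1000 held in a hypothetical fund FUNDA that is 10% AMZN, 500 in a hypothetical FUNDB that is 20% AMZN, and 200 of AMZN held directly, the total AMZN exposure comes out at 400.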

Documentation

I have a number of documents pending release, some of which form part of the Beancount Cookbook, which exists to provide guidance on selecting accounts and methods to book various things.

Tasks:

  • “Introduction to Double-Entry Booking.” The most important one is this intro doc. I have a very clear idea of how I want to present this, but for some reason I’ve written everything else and left this for last. I had a couple of shots at it but never really finished it. This document will also contain a section to explain how inventories “work”, that is, how amounts posted to them are matched against their existing contents. Complete this doc first. {write-intro-doc}
  • “Cookbook: Taxes.” I have a nearly complete cookbook document on booking taxes. It’s timely [April 2016] and I should try to complete that now for this reason.
  • “Cookbook: Health Care”. I have a nearly complete cookbook document on booking health-care costs. Tracking health-care costs in the US has caused me headaches occasionally but I’ve managed to come up with a nice framework for doing so. I need to finish this document to share this. (Ideally with DEDUC and COPAY legs explanation. Write an accompanying plugin to insert the deductible tracking and what-not.)
  • “Property sale example.” Beancount was begun shortly after buying my first real-estate property (a condominium loft in Montreal). Because of this, I tracked all related expenses in Beancount and I’d like to build an example script which processes my transactions to extract just those and finally calculate the precise return on the property, including every single little tidbit that was spent on it. My goal is to demonstrate that although at face value a profit was made, when you account for all the details you realize that the true return is much smaller, even potentially a loss. Part of the work here involves ensuring that all the input numbers can be pulled out of Beancount directly, without using a spreadsheet. I think that doing this might require some custom scripting and I intend for this use case to become an example of writing scripts against the Beancount API.
  • Write a worked and detailed example of generating automated transactions in the Plugins document.
  • Make sure this doc is linked from somewhere: http://furius.ca/beancount/doc/proposal-rounding
  • Create an index page of all the possible reports, from the web page.
  • Write more on saving, “Climbing the Capital Mountain.”
  • Summarize Ledger’s --limit --real --virtual --equity, etc. options.
  • Write a doc about how conceptually Beancount’s data is really just a single large joined table of Transactions JOIN Postings with some repeated fields for inventory and tags, some data types for Amount and Inventory along with accompanying functions to deal with those types. This could really help a lot of people understand the framework for extracting data from Beancount.
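The "single large joined table" view mentioned above can be illustrated with a small sketch. The `Transaction`, `Posting` and `Row` tuples below are simplified stand-ins defined only for this example, not Beancount's actual data types; the point is that each posting becomes one row that repeats its parent transaction's fields, with a transaction id as the join column.

```python
import datetime
from decimal import Decimal
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Posting(NamedTuple):
    account: str
    number: Decimal
    currency: str


class Transaction(NamedTuple):
    date: datetime.date
    narration: str
    tags: Tuple[str, ...]
    postings: List[Posting]


class Row(NamedTuple):
    txn_id: int          # join column: identifies the parent transaction
    date: datetime.date  # repeated from the transaction
    narration: str       # repeated from the transaction
    tags: Tuple[str, ...]
    account: str
    number: Decimal
    currency: str


def postings_table(transactions: Iterable[Transaction]) -> Iterator[Row]:
    """Flatten transactions into one row per posting (Transactions JOIN Postings)."""
    for txn_id, txn in enumerate(transactions):
        for posting in txn.postings:
            yield Row(txn_id, txn.date, txn.narration, txn.tags,
                      posting.account, posting.number, posting.currency)
```

Filtering and aggregating over such rows (for instance, summing `number` grouped by `account`) is then ordinary table processing.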

Misc Documentation Tasks

  • Write a script to automatically convert and upload the docs for the shell functions and what-not into a Google Docs that we can open with a web browser. Write a script to spit it out in a nice format and upload.
  • In the comparison doc: describe how Beancount has asset types
  • Comparison w/ Ledger doc: “balance sheet and closing of year”
  • Write a document that explains how to convert to and from Ledger and/or HLedger.
  • Complete “How Inventories Work?”
  • Complete intro document on double-entry bookkeeping.
  • Complete Design Doc
  • Change documentation script to try to download to ODT format and then batch format to ebook
  • You should have a dedicated section of your document that explains how market values are reported, that is, via the unrealized gains plugin. Also provide a market() function, to value holdings.
  • Finally bake a PDF of all GDocs documentation and add a link to it. Should be mobile-friendly.
  • Merge “Getting Started with Beancount” & “Tutorial & Example” into a single document. See comments from: https://docs.google.com/document/d/1w5wWVFuPe6H2Aeex8iqCL8YfAO6xNZgzmrnRNlvxJec/
  • Merge “A Comparison of Beancount, Ledger & HLedger” & “Beancount History & Credits” documents into one. See comments from: https://docs.google.com/document/d/1w5wWVFuPe6H2Aeex8iqCL8YfAO6xNZgzmrnRNlvxJec/
  • Move the “spreadsheet / mint / quicken / quickbooks / gnucash / sql” comparison bit out of the “Motivation” document into the “Comparison” document. See comments from: https://docs.google.com/document/d/1w5wWVFuPe6H2Aeex8iqCL8YfAO6xNZgzmrnRNlvxJec/
  • Make the four extraction methods clear, create a “part” for the four docs.
  • Document average cost inventory booking; in doing so, include this example from yegle:

    “The average cost booking could also be useful in other scenarios. For example, when you buy gift card in various discounted price, redeem it to an account, and use it later. One can manually recalculate cost-basis like this:

    2016-04-15 open Assets:CC
    2016-04-15 open Assets:ITUNES
    2016-04-15 open Expenses:Movie

    2016-04-15 * "15% off giftcard"
      Assets:CC      -85 USD
      Assets:ITUNES  100 GUSD {2016-04-15, 0.85 USD}

    2016-04-16 * "no discount"
      Assets:CC      -50 USD
      Assets:ITUNES   50 GUSD {2016-04-16, 1 USD}

    2016-04-16 * "recalculate cost basis"
      Assets:ITUNES   -50 GUSD {2016-04-16, 1 USD} @1 USD
      Assets:ITUNES  -100 GUSD {2016-04-15, 0.85 USD} @0.85 USD
      Assets:ITUNES   150 GUSD {2016-04-16, 0.90 USD}

    2016-04-16 * "rent a movie"
      Assets:ITUNES   -10 GUSD {2016-04-16, 0.90 USD}
      Expenses:Movie   10 GUSD {2016-04-16, 0.90 USD}

    (http://pastie.org/private/fxtirb6nhojyqmhuop9dfq) but it would be good to have beancount handle it for me.

    This is exactly what I do as well, and had similarly thought it would be a great secondary use for average cost booking. It would be good to document this somewhere. Perhaps Martin would like it in one of his documents or at Simon’s plain text accounting site.”

    When I write the document on average cost basis, I’ll add this example in.
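The manual "recalculate cost basis" step in the gift card example above is just a weighted average over the existing lots. A minimal sketch of that arithmetic, with a hypothetical `average_cost()` helper (not part of Beancount), could look like this:

```python
from decimal import Decimal
from typing import List, Tuple


def average_cost(lots: List[Tuple[Decimal, Decimal]]) -> Tuple[Decimal, Decimal]:
    """Merge (units, cost-per-unit) lots into a single lot at average cost.

    Returns (total units, average cost per unit).
    """
    total_units = sum((units for units, _ in lots), Decimal(0))
    total_cost = sum((units * cost for units, cost in lots), Decimal(0))
    if total_units == 0:
        raise ValueError("cannot average an empty position")
    return total_units, total_cost / total_units
```

With the two lots from the example, 100 GUSD at 0.85 USD and 50 GUSD at 1 USD, the total cost is 85 + 50 = 135 USD for 150 GUSD, giving the 0.90 USD per unit used in the merged lot.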

Auto-Generated Doc

  • Integrate submitted work on auto-generating Beancount docs in the Beancount source code itself.
  • Create a registry of all metadata fields being used in the system, with documentation for each, so that it does not end up becoming a mess.
  • Create a registry of valid experiment flags. These should be documented in a single place as well. This is all easy. How to list the available ones (bean-doctor should be able to do this from the registry).

Presentation

I should build another couple of presentations on Beancount. I’m thinking of making a few videos this time around, something that would be situational and educational.

  • Use impress.js to build a visualization of the DE method
  • Use photos of white board instead of Slides, use post its, maybe a chalk board. Do something original.
  • Record a video, that’s an easy way to explain how this works.
  • IDEA!!! Use drawings a-la-ThinkBig or whatever it is. This will be the perfect medium for this. Mix it with video. Start writing a detailed script.
  • Script 1: Begin with a USB key in hand. “On this 8 GB USB key, I have all of 8 years history of financial transactions in my life. Every single price paid was recorded into an account in here.”
  • Script 2: Motivate by speaking of the results, ask “Wouldn’t it be great if a single system could do all of that?” Explain that the task of inputting all the data in a single system is a lot less work than producing all the different reports IF you have a unified system. Beancount!
  • Slide: 4 methods: bean-web, bean-report, bean-query, write script + plugin

Padding Documentation Review

Some users have found themselves very confused about the intended usage of the Pad directive. Its documentation should be reviewed and improved substantially.

  • Put this in the docs to explain “pad”

    > > Ok, restarted example, let’s say you begin
    > > accounting in dec 2013, you’ll have this:
    > >
    > > 2010-01-01 open
    > > 2010-01-01 pad
    > > 2013-12-04 balance
    > > 2013-12-08 * …
    > > 2013-12-11 * …
    > > 2013-12-17 * …
    > >
    > > eventually, moving forward, you’ll get to 2014:
    > >
    > > 2010-01-01 open
    > > 2010-01-01 pad
    > > 2013-12-04 balance
    > > 2013-12-08 * …
    > > 2013-12-11 * …
    > > 2013-12-17 * …
    > > 2013-12-22 * …
    > > 2013-12-29 * …
    > > 2014-01-02 * …
    > > 2014-12-04 balance
    > > 2014-12-06 * …
    > >
    > > Allright, now you decide you like this, and you
    > > want to enter statements before you started.
    > > You find your paper statement for november, and
    > > fill in:
    > >
    > > 2010-01-01 open
    > > 2010-01-01 pad
    > > ; here you insert
    > > 2013-11-04 balance
    > > 2013-11-08 * …
    > > 2013-11-18 * …
    > > …
    > > ; this is what was there previously
    > > 2013-12-04 balance
    > > 2013-12-08 * …
    > > 2013-12-11 * …
    > > 2013-12-17 * …
    > > 2013-12-22 * …
    > > 2013-12-29 * …
    > > 2014-01-02 * …
    > > 2014-12-04 balance
    > > 2014-12-06 * …
    > >
    > > Great. Now, notice how the balance for
    > > 2013-11-04 is probably different than that of
    > > 2013-12-04. If instead of a pad directive you
    > > had added a manual adjustment, you’d have to
    > > change it here. This is the beauty of the pad
    > > directive: it automatically does it for you.
    > >
    > > Now, let’s keep going backward in time. You dig
    > > around your records, you find September’s
    > > statement, but you cannot find the statement
    > > for October, maybe it’s lost, or somewhere
    > > else. Fine! You insert a pad directive to
    > > account for those missing transactions:
    > >
    > > 2010-01-01 open
    > > 2010-01-01 pad
    > >
    > > 2013-09-04 balance
    > > 2013-09-05 * …
    > > … september transactions
    > > 2013-09-30 * …
    > > 2013-10-04 balance
    > >
    > > ; padding for missing October statement, where is my statement?
    > > 2013-10-04 pad
    > > 2013-11-04 balance
    > >
    > > … november transactions
    > > 2013-11-08 * …
    > > 2013-11-18 * …
    > > …
    > > 2013-12-04 balance
    > >
    > > 2013-12-08 * …
    > > 2013-12-11 * …
    > > 2013-12-17 * …
    > > 2013-12-22 * …
    > > 2013-12-29 * …
    > > 2014-01-02 * …
    > > 2014-12-04 balance
    > >
    > > 2014-12-06 * …
    > >
    > > This is the full example.
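To make the behaviour in the quoted example concrete, here is a simplified sketch of how a pad amount follows from the next balance assertion. It models a single account and a single currency with plain tuples; the `compute_pads()` function and the entry format are assumptions made for this illustration, not Beancount's implementation.

```python
import datetime
from decimal import Decimal
from typing import List, Optional, Tuple

# (date, kind, amount) where kind is "pad", "balance" or "txn".
# Pads carry no amount; it is what we compute.
Entry = Tuple[datetime.date, str, Optional[Decimal]]


def compute_pads(entries: List[Entry]):
    """Return ([(pad date, padded amount)], [error messages])."""
    running = Decimal(0)
    pending_pad: Optional[datetime.date] = None
    pads = []
    errors = []
    for date, kind, amount in sorted(entries, key=lambda entry: entry[0]):
        if kind == "txn":
            running += amount
        elif kind == "pad":
            pending_pad = date
        elif kind == "balance":
            if pending_pad is not None:
                # The pad absorbs whatever is missing to reach the asserted balance.
                pads.append((pending_pad, amount - running))
                running = amount
                pending_pad = None
            elif running != amount:
                errors.append(f"{date}: expected {amount}, got {running}")
    return pads, errors
```

Inserting an earlier balance assertion after the pad, as in the November step of the example, changes the computed pad amount without any manual adjustment, while the later assertions keep being checked against the real transactions.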

Improve this bit:

> But the detailed explanation cannot be found. There’s only one phrase: “Think
> of the Equity account as something from the past that you had to give up in
> order to obtain the beginning snapshot of the Assets + Liabilities balance.”
>
> Great comment. I’ll improve this.

More user comments:

> After doing my research, I found about debits and credits, which in Beancount
> you represent with positive numbers and negative numbers respectively. I
> found that having a name for each group of accounts helps me to think of them
> at the same time, e.g. Liabilities+Equity+Income as part of a common thing,
> instead of having to research each of it independently.
> In the documentation you start speaking about numbers, then about signs, then
> about grouping the accounts. Maybe it’s better to go top-down and start
> saying that there are two types of account (usually +, usually -) and then
> divide each group further.
>
> I will change that, thanks for the comment.

Contributor agreement

  • See note at the bottom of: https://github.com/fourier/ztree

    Since ztree is a part of GNU ELPA, it is copyrighted by the Free Software Foundation, Inc. Therefore in order to submit nontrivial changes (with total amount of lines > 15), one needs to grant the right to include your works in GNU Emacs to the FSF.

    For this you need to complete this form, and send it to assign@gnu.org. The FSF will send you the assignment contract that both you and the FSF will sign.

    For more information one can read here to understand why it is needed.

    As soon as the paperwork is done one can contribute to ztree with bigger pull requests. Note that pull requests without paperwork done will not be accepted, so please notify the maintainer if everything is in place.

  • Great example (from Camlistore project link): https://cla.developers.google.com/about/google-individual

Google Docs

  • Write a script to download and bake all my PDFs docs in a mobile-friendly format.
  • I’d like the documentation links to open in “View” mode by default, YET still allowing the user to switch to “Suggestion” mode if they want to. See project volteface to complete this, this does not have to be a part of Beancount, this should be its own presentation project.

Complete Display Precision Work

Automatically determining the required number of fractional digits to render numbers with in reports can be surprisingly more complicated than it appears. Statistics on the numbers visible in the input file can be used to figure out good default values and options exist to override them. In addition, while some numbers should be rendered at their most common precision, other numbers–prices and rates–need to be rendered at a higher precision. Moreover, numbers rendered from an import tend to be produced with the minimal number of fractional digits but to render them nicely we should normalize that to the most common precision, while rendering more digits if they are present.

For all the reasons above, at some point I built beancount.core.display_context, which provides an accumulator (“context”) for statistics on precision and a formatter that can be passed around and used by the various rendering methods to produce sensible output.
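The kind of accumulator described here can be sketched in a few lines. The `PrecisionStats` class below is a hypothetical illustration of the idea (count fractional digits per currency, then report the most common and the maximum); it is not the actual beancount.core.display_context API, and it assumes finite Decimal inputs.

```python
from collections import Counter, defaultdict
from decimal import Decimal
from typing import DefaultDict


class PrecisionStats:
    """Accumulate how many fractional digits numbers are written with."""

    def __init__(self) -> None:
        self._counts: DefaultDict[str, Counter] = defaultdict(Counter)

    def update(self, number: Decimal, currency: str) -> None:
        # The exponent of a finite Decimal is minus its number of fractional digits.
        digits = max(0, -number.as_tuple().exponent)
        self._counts[currency][digits] += 1

    def most_common(self, currency: str) -> int:
        counts = self._counts.get(currency)
        if not counts:
            return 0
        return counts.most_common(1)[0][0]

    def maximum(self, currency: str) -> int:
        counts = self._counts.get(currency)
        if not counts:
            return 0
        return max(counts)
```

Amounts would typically be rendered at `most_common()` digits, while prices and rates would use `maximum()`.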

Tasks:

  • At the moment, each Formatter object is defined to render a single precision for its numbers; this is problematic, because some of the to_string() routines of the core objects print out units, cost and price numbers at the same time. The Formatter object needs to be extended to support formatting all three.
    • Then, the core data structures should be reviewed for correct usage of this new, better formatter object.
  • When I implemented the display_context, I did not review the entire codebase for consistent usage. I’m a bit ashamed to admit that some of the code uses it, and some doesn’t. It was a bit thorny to implement and after spending much time getting it to work right I did not finish converting all the existing code to it (I should have). In particular:
    • The reports code does not use it everywhere. Review all reports code for proper use of DisplayContext.
    • bean-web does not use it everywhere. Review all that code.
    • The rendering code used by bean-query has a DecimalRenderer object which uses its own method for determining the precision; it should clearly be converted to use the DisplayContext instead.
  • In beancount.ingest, in order to render numbers pulled from imported data nicely for user review, I need to create a new rendering type of max(MAXIMUM, MOST_COMMON) for the DisplayFormatter. This type will always render at least the most common number of fractional digits but if required, render more fractional digits. I’ve found over time that this would be the most palatable rendering style to produce: learn the precision from the Beancount input file but render the full precision found in the imported files.
  • Make the printer support explicit tolerance syntax. This is crucial for round-trip.
  • Implement the maximum number of digits by accumulating the largest possible sum of all the ABSOLUTE values seen in the file. This should be a good basis for computing the maximum width possibly required to render with Align.RIGHT. There is currently a bug whereby that’s not being computed correctly: the current algorithm is insufficient for accumulated balances.
  • Create a page in the web view that lists the various precisions, by example, inferred in the global DisplayContext.
  • Make setting the precision from bean-example easier (provide a method to create that format and update without conversion on the DisplayContext itself).
  • In the DisplayContext, implement caching of the formatters created to increase speed, especially for printing a single entry repeatedly.
  • Implement reserved number of digits.
  • Add docstrings to DisplayContext.
  • In the EntryPrinter(), figure out the maximum width of accounts and set it up.
  • Add a “target num columns” instead of a “min_width_account” to the EntryPrinter and figure out the min_width_account automatically from it, depending on whether we’ve got render_weights on or not. Use the longest possible number of integral digits required from the DisplayContext in order to make this tight.
  • Add an option for the DisplayContext to issue a warning if numbers are rendered through it that lose some precision.
  • Add “display_precision” input file option whereby the user can set the precision to be used by each currency.
  • This needs to be handled properly, the last column needs to have correct precision rendering, using the DisplayContext:

    bean-query trans-pricing.beancount 'select account, sum(position), cost(sum(position)), convert(cost(sum(position)), "EUR") group by account'

    account                 sum_position                cost_sum_posi convert_cost_sum_position_c_
    ----------------------- --------------------------- ------------- ----------------------------------
    Assets:Investments:Cash -734.10 USD                 -734.10 USD   -644.2864665613480779357556609 EUR
    Assets:Investments:HDV  10 HDV {73.41 USD}          734.10 USD    644.2864665613480779357556609 EUR

  • bean-doctor context does not render all digits… it should render all the numbers as represented in memory. Don’t round those numbers.
  • Review all the reports code and convert it all to render using the DisplayContext, make the query output also use it, and review rendering precision for cost & prices and maybe find a self-contained solution for passing two separate formatters somehow. {review-dcontext}
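The max(MAXIMUM, MOST_COMMON) rendering style proposed in the tasks above amounts to "never fewer digits than the ledger's common precision, never fewer than the number itself has". A minimal sketch, assuming the common precision has already been inferred elsewhere; `render_number()` is a hypothetical helper, not an existing Beancount function:

```python
from decimal import Decimal


def render_number(number: Decimal, common_digits: int) -> str:
    """Render with at least `common_digits` fractional digits, more if present."""
    natural_digits = max(0, -number.as_tuple().exponent)
    digits = max(natural_digits, common_digits)
    return f"{number:.{digits}f}"
```

With a common precision of 2, `Decimal("734.1")` renders as `734.10` and `Decimal("10")` as `10.00`, while `Decimal("644.2864")` keeps all four of its digits.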

Move Much of Core to Plugins

Some of the functionality currently residing in the Beancount core could be moved to plugins. For example, processing directories for auto-creating document directives should not be part of the core (it’s there for historical reasons: plugins did not exist at the time). I’d like to do that in order to simplify the Beancount core itself.

Document Directives

Document directives should be moved to their own plugin and numerous improvements can be carried out on them. Associating them with transactions and avoiding collisions when creating them automatically would both be improvements.

  • Move the document option which seeks document files from directives to a plugin.
  • IDEA: Create a plugin that will convert “doc:” metadata to a document file, that will search for a unique string name in all the filenames and associate the filename with this directive via a link or something.
  • Write a proposal for implementing a transformation on a specific set of transactions, that supports capital gains with commission taken into account.
  • Can we automatically add a ^LINK to the document directive in order to associate a PDF with a document?!? -> For trade tickets. Maybe let the modules provide a import_link() function on the associated PDF files? (This is related to ^64647f10b2fd)
  • Document directives created automatically from directories of files should ignore documents already associated via an explicit documents directive; just ignore files with the same absolute name.
  • Documents found in parent directories don’t end up creating a directive because we skip them because we only restrict to accounts which have had an open declaration… this is probably not what we want, in order to maximize the number of documents captured by this. {fa96aa05361d}
  • Idea: A link between a transaction to a document can be created by associating a document’s checksum as the link of the transaction instead of a unique filename. If Beancount could associate them - and it could, it has access to the document files and the corpus of transactions - the web interface could insert a special link between the two. Maybe we could do the same thing with the filename as well.
  • A better idea to do this would be to allow specifying an explicit document directive, and finding document directives from files that are already specified should not re-create them. This way you can specify both the document and a transaction and use a common link as a natural way to associate them, e.g.:

    2014-06-20 document Income:US:Employer:GSU "2014-06-20.employer.0000000.pdf" ^ee63c6fc7aa6

    2014-06-20 * "PAYROLL" | "Refund for fractional shares" ^ee63c6fc7aa6
      …
      …

  • Document finding from files should not create documents that have been explicitly specified in the ledger. Avoid duplication! This is an important fix to make, that will allow both to co-exist together.
  • Make document directive accept links, so that explicit documents could be associated with specific transactions.
  • Generalize the scope of Document directives so that they don’t just point to filenames, but can hold references to arbitrary document ids.
    • Rename the field from ‘filename’ to ‘document’
    • Make the verification of that field against a filename optional, enabled by a plugin.
    • Notify bw3443 about this.
  • Do we need a dedicated web page for listing all documents? This page could include documents without a date, could be rendered as a tree-table, with the list of each document in the corresponding account. Maybe that’s overkill. DK.
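The rule stated above, that documents discovered in directories should not duplicate those declared explicitly in the ledger, could be sketched as a simple filter on absolute filenames. The function name and the plain-string inputs are assumptions for illustration; a real plugin would work on Document directives instead.

```python
import os
from typing import Iterable, List


def new_document_files(
    explicit_filenames: Iterable[str],
    discovered_filenames: Iterable[str],
) -> List[str]:
    """Return discovered files not already declared, compared by absolute name."""
    known = {os.path.abspath(name) for name in explicit_filenames}
    result = []
    for name in discovered_filenames:
        absolute = os.path.abspath(name)
        if absolute in known:
            continue  # explicitly declared, or already seen: skip it
        known.add(absolute)
        result.append(absolute)
    return result
```

Only the files returned by this filter would get an automatically created document directive.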

Old Notes

  • You could define some special tags, like ‘document’, which could automatically try to find the corresponding document in the repository and insert a link to it in the web page. I already have a managed stash of document filenames… something like this:

    2014-05-20 * "Invoice from Autodesk" #document: 2014-05-20.autodesk.invoice200.pdf
      Income:US:Autodesk  -3475.20 USD
      Assets:US:Checking

    A document filename that does not get resolved could spit out a warning in order to keep the file tidy. This is a nice idea… perhaps nicer than just insert entries for documents, an actual link. Not sure if it would make that much of a difference though. Something to ponder.

    Create a plugin that will convert “doc:” metadata to a document file, that will search for a unique string name in all the filenames and associate the filename with this directive via a link or something.

Balance Directives

  • Balance checks could also be a plugin.

Open/Close Directives

  • I think if you relax the assumptions about having Open and Close directives for all accounts, those could even be moved to a plugin. Without this ‘open_close’ plugin, accounts would just get auto-created and no error would be output if they weren’t declared. With the plugin, we would have the current strict behavior. This means that non-plugin code that requires the full set of accounts from a list of entries would not be able to rely anymore on the presence of open entries, and neither would the validation.

Pad Directives

  • Put all the Pad code into a single file as a plugin, same with Open, Close, and Balance. Maybe we can organize that code to be all localized in single files, and many of these features can be implemented in self-contained plugins with all their code together: openclose.py, pad.py, balance.py, etc. I think even ‘event’ directives can become those. And maybe a good way to disambiguate between ops and plugins is just this… maybe ops is non-plugins, e.g. prices, summarize, etc.
  • Pad could be a plugin, definitely.
  • You could make the narrations for padding and summarization transactions specifiable via options.

Dynamic Directives / Configurable

  • (old notes) I recently teased out that many of the basic functions can be implemented as individual stages of transformations on the list of directives. This started out as a way to add plugins by adding a custom transformation stage, but now I see that if I can make the parser able to consume generic syntax that might allow extensions, and to allow these plugins to specify new directive types for extensions, I might be able to shove a lot of the existing functionality into nice isolated plugin modules. Even functionality as basic as Balance checks. I’m not going to do this in the first release, but I want to set the stage for it.
  • Make the plugins able to register types with the parser… this should allow the parser to call back on the plugins to create the appropriate types… this means true extensibility throughout! This is a fantastic idea… do this after v2 ships. Maybe they get parsed as a special “Unknown” directive that accepts a grab-bag of strings and tags and accounts and amounts and they get replaced by the plugins; whatever Unknown trickles through would generate a warning in the errors.
  • It should be possible to make the parser accept unknown directives that accept an arbitrary list of accounts and string parameters, like this:

    2014-06-01 unknown Assets:US:CreditCard “Something”
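One way to picture the "Unknown directive replaced by plugins" idea above is the following sketch. Everything in it is hypothetical: the `Unknown` tuple, the `DirectiveRegistry` class and its methods are invented for illustration and do not correspond to any existing parser or plugin interface.

```python
import datetime
from typing import Callable, Dict, List, NamedTuple, Tuple


class Unknown(NamedTuple):
    """Generic directive as the parser might produce it (hypothetical)."""
    date: datetime.date
    type: str
    values: Tuple[object, ...]  # grab-bag of accounts, strings, amounts...


class DirectiveRegistry:
    """Lets plugins claim directive types and convert Unknown entries."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Unknown], object]] = {}

    def register(self, type_name: str, handler: Callable[[Unknown], object]) -> None:
        self._handlers[type_name] = handler

    def resolve(self, entries: List[object]) -> Tuple[List[object], List[str]]:
        resolved: List[object] = []
        warnings: List[str] = []
        for entry in entries:
            if not isinstance(entry, Unknown):
                resolved.append(entry)
                continue
            handler = self._handlers.get(entry.type)
            if handler is None:
                # Nothing claimed this directive: let it through with a warning.
                warnings.append(f"{entry.date}: unhandled directive '{entry.type}'")
                resolved.append(entry)
            else:
                resolved.append(handler(entry))
        return resolved, warnings
```

A plugin would call `register()` with its directive name and a constructor for its own type; any `Unknown` left after `resolve()` shows up in the warnings, matching the "whatever Unknown trickles through would generate a warning" behaviour.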

Import (Ingest)

In Q1 2016, I converted the entire LedgerHub codebase to a leaner, simpler and better set of code under beancount.ingest. I finished all the work to convert what was previously there and while I’m quite happy with it, my original ideas for what the import code could do went a bit beyond what it currently does. It involves a number of small but impactful improvements, such as adding an on-disk cache so that slow conversions (e.g., from PDF) would not have to be repeated, or carrying out some default cleanup of common input files. I also want to add a good CSV importer, and maybe a QIF one.

CSV Importer

  • Separate the concern of the file format and table importers. This is a “table importer” which happens to have a CSV container. Similarly, XLSX containers are other table importers. Name things appropriately.
  • Create a good, default, useful, configurable importer for CSV files.
  • Write a CSV file sniffer that automatically infers which columns should match which fields. Try to make this smart, e.g., if some column “mostly” increases, it would be the balance column. This should be for fun (I don’t need it).
  • Use new CSV downloads to build the importer; start using it.
  • The sniffer should support column types:
    • Date
    • Description
    • Change
    • Credit
    • Debit
    • Balance

    It should also support some cross-checking of the balance against the amounts, within a tolerance.
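A first cut of such a sniffer might look like the sketch below. It is only one possible heuristic, built on assumptions: ISO-formatted dates, a header row, rectangular data, and a balance column identified by cross-checking consecutive balances against a change column (rather than by the "mostly increases" idea). The `sniff_columns()` name and the returned labels are made up for this example, and Credit/Debit detection is left out.

```python
import csv
import datetime
import io
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional


def _is_date(value: str) -> bool:
    try:
        datetime.date.fromisoformat(value)
        return True
    except ValueError:
        return False


def _to_decimal(value: str) -> Optional[Decimal]:
    try:
        number = Decimal(value.replace(",", ""))
    except InvalidOperation:
        return None
    return number if number.is_finite() else None


def sniff_columns(text: str, tolerance: Decimal = Decimal("0.005")) -> Dict[str, str]:
    """Guess a type for each column: date, description, number, change, balance."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    if len(body) < 3:
        raise ValueError("need at least three data rows to sniff reliably")
    kinds: Dict[str, str] = {}
    numeric: Dict[str, List[Decimal]] = {}
    for index, name in enumerate(header):
        values = [row[index] for row in body]
        if all(_is_date(value) for value in values):
            kinds[name] = "date"
            continue
        numbers = [_to_decimal(value) for value in values]
        if all(number is not None for number in numbers):
            kinds[name] = "number"
            numeric[name] = numbers
        else:
            kinds[name] = "description"
    # Cross-check: balance[i] - balance[i-1] should equal change[i].
    for balance_name, balances in numeric.items():
        for change_name, changes in numeric.items():
            if balance_name == change_name:
                continue
            if all(abs(balances[i] - balances[i - 1] - changes[i]) <= tolerance
                   for i in range(1, len(body))):
                kinds[balance_name] = "balance"
                kinds[change_name] = "change"
    return kinds
```

The cross-check doubles as a validation step: if no pair of numeric columns satisfies it, the file either has no balance column or the rows are not in chronological order.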

Solve Slowness Problems on Mac

  • Trying to bean-extract from LC on Mac takes FOREVER. Debug this thing damnit.
  • Implement on-disk cache now.
  • Implement logging of time taken by each importer (verbose switch). In find_imports(). Do nice logging.
  • Fix ugly importer names from my configuration.
  • Uniformize all directory names from my configuration.
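The on-disk cache mentioned above could be as simple as keying converted text by a checksum of the source file's contents. This is a sketch under assumptions: `cached_convert()`, its signature and the cache layout are hypothetical and not part of beancount.ingest, and the converter is any callable taking a filename and returning text.

```python
import hashlib
import os
from typing import Callable


def cached_convert(
    filename: str,
    converter: Callable[[str], str],
    cache_dir: str,
) -> str:
    """Run `converter` on `filename`, reusing a cached result when possible."""
    with open(filename, "rb") as infile:
        digest = hashlib.sha256(infile.read()).hexdigest()
    cache_file = os.path.join(cache_dir, digest + ".txt")
    if os.path.exists(cache_file):
        with open(cache_file, encoding="utf-8") as cached:
            return cached.read()
    text = converter(filename)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "w", encoding="utf-8") as out:
        out.write(text)
    return text
```

Keying on contents rather than on the filename means a renamed or re-downloaded copy of the same statement hits the cache. If several converters are used on the same files, the key would also need to include the converter's identity.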

Auto-Insert

  • AWESOME IDEA: Write a script that, given an input file with various sets of transactions, will find sections with most similar transactions in a destination file (the main file) and automatically insert those transactions in the destination file in the right place and run a diff with it. This should be part of the import suite. Do this as a post-process to import! Maybe use org-mode or other separators as a way to segregate groups of transactions.

Regression Testing Improvements

  • Make example and real importers into properly named modules, not package __init__.py files and move tests to their own _test.py files, as per usual.
  • Improve the regression testing utility function: Allow the _test files to source outside of the same directory, but keep it easy to do so as well. Some people may want to share their importers yet test on real data too. This should be well supported. (For ‘beansoup’.)

Future Features

  • When multiple files match (as is the case for the A:CA:R importer) make file_one_file() put a copy of the file in each place instead of issuing an “Ambiguous accounts from many importers” warning.
  • Make beancount.ingest.importers.regexp.RegexpImporterMixin accept a “convert” optional argument that would be called by its get_text() method, and would allow us not to have to derive in order to support non-text files.
  • Implement formatting that’s “minimum most-common” and use it on the importer’s output, it would really improve the output of it, which is currently literal, e.g., it just prints what’s from the file, at that precision. I hate correcting those manually.
  • In identify, mark as ambiguous if two handlers match.
  • Run directories check on each run of file? Disable with an option.
  • Make sure that when an exptest fails it includes the filename of the test somewhere in there; it’s annoying when it’s not present there.
  • Printing transactions should use the maximum precision between display context and natural. Create a new DisplayContext style, on top of “COMMON” and “MAXIMUM”, call it something else, perhaps “COMPLETE”. Should be max of context and natural.
  • Implement on-disk caching for conversions. This is important–it’s quite frustrating to wait a few times too many. Do this now.
  • The code that auto-files statements needs to detect and de-dup files which have the same contents, because it is likely some files will be downloaded twice, it happens a lot.
  • When payee == narration, remove one of the two. This should be a generic and simple enough check to apply. This is a common occurrence.
  • There should be an automatic check that the accounts output by the extraction are all valid Beancount/Ledger account names.
  • Duplicate entries should cover all directive types (for balances).
  • Also, similarly, if one is entirely contained in the other, as happens often in OFX files, remove one. For example, here’s what some OFX imports may look like with the current OFX importer:

    “DDA WITHDRAW MG21 MTL QC A / DDA WITHDRAW MG21 MTL QC ABM OPS MTL ABM O MONTREAL C AN”

  • Validation of directories (aka bean-doctor directories) should be automatically invoked by bean-file, and that would make sense.
  • Implement some hook to provide your own auto-categorization.
  • Provide some module that will attempt to categorize automatically, that can be inserted into this auto-categorization hook.
  • It would be super awesome if you could automate away the insertion of new transactions in the input file, following the previous transactions… we can find a heuristic to do this. That would be really cool.
  • We will want to somehow “normalize” and merge payee names, because some of them differ very little and are obviously for the same business… this would be useful. What kinds of tools will we need for that?

    Clean Import: The new importers should be able to strip the non-payee parts of the payee name, e.g. NEW YO, SAN FR, etc. Maybe we could let the user provide a filter function to sanitize the names of the payees, or maybe more generally just a filter on the entries before printing them out. This way you could provide your very own custom filter function that cleans up anything you don’t like.

  • Importer should warn when not all the sections within have importers for it (R. OFX has many sections and one of the sections I needed - credit line - was missing in my importer config, so I discovered this). If we do this, this means that we may want to provide a “no-op” importer in order to silence that warning in case that’s what’s desired.
  • Write a generic import routine that will try to heuristically match partially completed transactions from an existing Ledger. Use some NLP or somesuch matching algorithm.

    Given some incomplete transactions, complete them heuristically based on previous contents of the ledger. This should make import a lot easier. This should be generic and work across all importers, a single function call.

  • Provide the previous ledger as input to extract(), some people may want to do per-account matching. See Filippo’s request on this, whereby he wants to use the previous ledger’s list of accounts to configure his importers. I suggested he load the previous file explicitly (taking advantage of the cache) in his configuration for now, but this is something I may want to revisit in the future.
  • In ingest, use /usr/bin/strings as a last resort if all other PDF converters fail.
  • When we import, if a file was not detected, don’t spit out an org text line. Still doesn’t work.
  • Deal with the problem of matching Google Wallet amounts directly… attempt to match a matching directive in the other account, and insert a link into the imported one.
  • Because one downloads all files manually, oftentimes the same file is downloaded twice. This should be detected and the duplicated identical file should be ignored entirely.
  • Transactions within the set of newly imported ones should never duplicate each other. Right now they do (see Transactions-Download-07-14-2016.csv).
  • It would be useful in many circumstances to rename the files but not to move them to their destination directory. For example, I’d rename the files and look at the filenames when processing PDFs manually to figure out what they’re for.
  • A very interesting idea is for the CSV importers to cache already imported entries in a separate file, e.g., one signature for each row, and to use that to avoid double-importing. OTOH, it creates a potential problem whereby the result of an extraction is not stored in the Beancount input file and so further extractions ignore unseen transactions. An idea that is better than both of these would be to create a unique link made up of a short checksum for the import row and to cross-check this special link when considering the duplication, e.g. ^import-signature-c21c0fc8fe13 or something like that. That would work.
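
The import-signature idea in the last item could be sketched like this. The helper names, the SHA-1 choice and the 12-character length are assumptions for illustration; only the link format follows the example above:

```python
import hashlib


def import_signature(row, length=12):
    """Build a link name from a short checksum of one import row."""
    # Strip cells and join them with a separator unlikely to occur in the data.
    normalized = "\x1f".join(cell.strip() for cell in row)
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return "import-signature-" + digest[:length]


def filter_new_rows(rows, existing_links):
    """Keep only the rows whose signature is not already a link in the ledger."""
    seen = set(existing_links)
    fresh = []
    for row in rows:
        link = import_signature(row)
        if link not in seen:
            seen.add(link)
            fresh.append((link, row))
    return fresh
```

Note that two genuinely distinct transactions with identical rows (same date, payee and amount) would collapse into one here, so a real version would need a tie-breaker such as the row's occurrence count within the file.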

GnuCash

  • Write a converter of GnuCash XML files into Beancount.
  • Contact xentac@ for an example GnuCash input file with 2-3 years of data. Willing to share (see email).

Betterment

  • Process xentac’s Betterment example PDF for the booking branch, that would be great input for a PDF converter, and figure out which lot identification method Betterment uses for tax-loss harvesting and rebalancing.

Make Commodity Directives Required

{commodity-required}

Zhuoyun Wei made a comment on a doc that account names are required to be declared by default while commodities aren’t. However, at this point they both have dedicated directives to declare them. Commodity directives are so far only used to hang metadata off of them. He suggests that Commodity directives should be required to be present by default. It would make the behavior similar to that of accounts. It would be more pedantic by default. Instead of a check_commodity plugin I could provide an auto_commodity plugin similar to auto_accounts, so you could use that in order to not bother. The stream of transactions would be assured to always contain a Commodity directive if one appears, like for accounts.

https://groups.google.com/d/msg/beancount/RHaV16bXlJs/Gyy5plujFgAJ

  • Do this.
  • OLD DESCRIPTION: options[‘commodities’] is currently where the list of all commodities seen from the parser lives. The beancount.core.getters.get_commodities_map() routine uses this to automatically generate a full list of directives. An alternative would be to implement a plugin that enforces the generation of these post-parsing so that they are always guaranteed to live within the flow of entries. This would allow us to keep all the data in that list of entries and to avoid depending on the options to store that output.

    This should probably be combined with a similar step that similarly enforces all unopened accounts to have an Open directive as well.

  • Make the onecommodity plugin skip accounts with multiple declared commodities. It should apply only to accounts which have no declared commodities. Then turn it on in my own file.

Booking

The ‘booking’ branch is one of the biggest branches ongoing in Beancount: it implements the proposal at http://furius.ca/beancount/doc/proposal-booking. It essentially consists of three big changes: (1) it allows a lot more opportunities for interpolating missing input; (2) it changes how lots are applied to inventories: input specified in reducing lots will be interpreted as a partial match specification against the inventory of the corresponding accounts, which should allow users to provide the minimal amount of information and produce a specific lot, and augmenting lots always inherit the date of their transaction, which will allow us to report on trades correctly, including automatically determining short vs. long-term holding periods; and finally (3) it introduces syntax to automatically merge lots together in order to deal with average cost booking. These changes will also allow automatic resolution of ambiguous lots, which will invoke a particular method chosen by the user, e.g. FIFO booking.

In order to carry on during the tax season for TY2015, I implemented a kludge in the “carry_date_and_book_cost” branch. This will get removed.

Tasks:

  • In the average cost method, add an option to determine whether explicit lot information should just be lost silently or if it should trigger an explicit error.
  • When lots are merged in AVERAGE booking, decide whether to keep the older or youngest lot using an option (add an option for this).
  • Make the “reduction_method” (rename from “booking_method”) per-account and support the old method in the new code, as “FIFO”. It really just is FIFO but now there will be lot splits instead of single lots being merged; this should work. Or you could call it “PRICEONLY”.
    • Provide a global option “reduction_method” for its default value across all accounts. Set the default to “PRICEONLY” so that no changes occur by default.
    • Dispatch the new ‘full’ booking method to reduction_method “STRICT”.
    • Support “NONE” and “AVCO” after that.
    • “AVDO” should be able to work even without the star marker.
  • Remove the horrible CARRY_DATE_AND_BOOK_COST kludge; do this by finishing the ‘booking’ branch and making the lots conform exactly to those present in the inventory balance, so that the inventory booking can proceed by comparing the entire cost object precisely…
  • Complete testing incomplete output with CostSpec:
    • This causes a problem because a tag is parsed in ‘#9.95’: 10 AAPL {45.23#9.95 USD}
    • Make sure parsing {3.00 # USD} does not produce the same CostSpec as {3.00 USD}.
    • Remove OLD comments from parser.py and update the description to use the xposition class.
  • Remove the beancount.ops.documents plugin, or perhaps make the loader operate in a raw mode, e.g. not running any plugins.

    egrep --include='*.py' -srn 'parse_(string|file)' /home/blais/p/beancount/src/python/beancount

  • Write test for beancount.parser.booking_full.
  • Merge beancount.core.interpolate into booking.
  • Merge beancount.ops.validation.validate_inventory_booking() into booking.
  • This causes a problem because a tag is parsed in ‘#9.95’: 10 AAPL {45.23#9.95 USD}
  • (Annoyance) Make interpolation.balance_incomplete_postings() return a new object and not work destructively. No reason not to.
  • (Annoyance) All functions that return an (entries, errors, options_map) triple really should return (entries, options_map, errors), in this order.
  • Improve the ability to detect a user-specific tolerance in is_tolerance_user_specified(). This heuristic is not great, find some better way.
  • Create a b.c.data.transactions_only() function to do this common type of filtering and grep/replace everywhere. This is just too common not to create a utility.
  • Unref the constants in finalizer for module, here: {48414425cf78}.
  • Write a test to ensure that an auto-posting with no effect results in no new posting getting inserted.
  • Implement an optional feature that disables the merging of inventory lots for all lots except lots without cost. In other words, if there’s a cost object, don’t even compare them or try to merge them together; only lots without cost would automatically merge.

    I believe that this probably wouldn’t cause any problems in selecting reducing lots and could simplify the “rules” for how inventory booking works. I’d want to test it out on my real file before committing to the idea. Removing merging of lots-at-cost would make the merging logic simpler and easier to understand.

    But what about something like this? Do we want multiple lots here?

    2005-12-29 * "Dividend on NB550"
      Assets:CA:RRSP:NB550   1.340 NB550 {18.2348 CAD}
      Assets:CA:RRSP:NB550  28.646 NB550 {18.2348 CAD}
      Income:CA:RRSP:Dividends

  • Test a conversion of shares with lot-date, e.g.:

    2000-01-18 * Buy CRA
      Assets:CA:RBC-Investing:Taxable-CAD:CRA  4 "CRA1" {232.00 USD / 2000-01-18}
      Assets:CA:RBC-Investing:Taxable-CAD  -1395.43 CAD @ 0.665027984206 USD ; cost

    2000-02-22 * CRA Stock Split 2:1
      Assets:CA:RBC-Investing:Taxable-CAD:CRA  -4 "CRA1" {232.00 USD / 2000-01-18}
      Assets:CA:RBC-Investing:Taxable-CAD:CRA  8 CRA {116.00 USD / 2000-01-18}

  • Create a command in bean-doctor which lists all of the lots and their changes for a particular account. This is meant to be a debugging tool for booking algorithms. The rendering should be clear and detailed.
  • TEST CASE: When a label is provided on a lot, make sure that the label is unique throughout the entire file, not just on the inventory the lot is being inserted into.
  • Rename beancount.core.interpolate to beancount.core.tolerances
  • An important question: on an empty inventory, does the following transaction work?

    2016-04-23 * "Buy and sell on the same transaction"
      Assets:Investments:HOOL  1 HOOL {100.00 USD}
      Assets:Investments:HOOL  -1 HOOL {}

    One way to handle this would be to process all account augmentations before reductions. But that’s not really possible, because the augmentation may require interpolation first. Hmmm. Difficult question. {f89b5b01e568}

    [This is being handled in the “booking” branch.]

  • Add an option “merge_lots” whose default value is TRUE, and which can be set to FALSE in order to avoid merging augmenting lots in the Inventory. If this is false, there’s the possibility of an inventory owning two lots which cannot be differentiated, and the only way for the user out of that situation would be to insert unique labels (indicate this somehow). Write unit tests for this. And try running it over my personal file to see if there’s any instance of it ever happening (I don’t think so).
  • Consider building a fromstring() constructor for Cost and CostSpec. This may make writing tests even easier and would complement what’s there already.
  • Deal with all the FIXME cases in the new booking code.
  • Does the Inventory object require changes to make sure the dates are always set? Would that not make sense?
  • Write scripts to compare with the new and old method, on my own large file:
    1. Compare balances at random points in time (many of them)
    2. Compare transactions one-to-one.
  • There’s a fair bit of sign change in the new booking code. Do test the case of negative inventories!

Auto-Link Booking Transactions

  • Automatically create a link between transactions that book each other. I’m not sure how I’m going to implement that - perhaps in the lot matching, a hash to the original entry will be kept in the lot - but we should be able to update the links to all the transactions that book together.

    This will be a great debugging tool as well… a very powerful idea that can be implemented entirely in a plug-in.

Original Idea for this Branch

  • Lot improvement: the lot specification on a reducing posting is only present to disambiguate which of the lots to reduce or match against. Maybe we should provide a different syntax when an expected reduction takes place, this would be allowed:

    (augment)

      2014-06-17 *
        Assets:US:Investing:HOOL 10 HOOL {523.45 USD / i-want-more}

    (reducing) All of the following should be allowed:

      Assets:US:Investing:HOOL -7 HOOL ; possibly ambiguous
      Assets:US:Investing:HOOL -7 HOOL [] ; possibly ambiguous
      Assets:US:Investing:HOOL -7 HOOL [523.45 USD]
      Assets:US:Investing:HOOL -7 HOOL [2014-06-17]
      Assets:US:Investing:HOOL -7 HOOL [i-want-more]

    By enforcing a distinct syntax, the user is telling us that this leg is expected to reduce an existing position. This information is useful, in that it avoids possible mistakes. I like the explicitness of it.

    Sufficient debugging output should be provided from the “print” command to be able to identify which lot is being matched against and why. We need to provide more transparency into this.

  • FIFO or LIFO booking could be “enforced” simply by declaring the expected booking method of an account, and then issuing an error when explicit entries deviate from that method. This is an easy idea… would be very useful. The automatic method would only be used to resolve ambiguity! This is nice.
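
To make the FIFO enforcement idea concrete, here is a small self-contained sketch. The Lot tuple and both functions are hypothetical stand-ins, not the Inventory or booking code, and the sketch assumes long (positive) lots only. reduce_fifo() resolves an ambiguous reduction by consuming the oldest lots first, and check_fifo() reports whether an explicit lot selection deviates from what FIFO would have chosen:

```python
from collections import namedtuple
from decimal import Decimal
import datetime

# Hypothetical lot record: number of units, per-unit cost, acquisition date.
Lot = namedtuple("Lot", "units cost date")


def reduce_fifo(lots, units_to_sell):
    """Reduce the oldest lots first; return (matched, remaining) lists of lots."""
    matched, remaining = [], []
    left = units_to_sell
    for lot in sorted(lots, key=lambda lot: lot.date):
        take = min(lot.units, left)
        if take:
            matched.append(lot._replace(units=take))
        if lot.units - take:
            # A partially consumed lot is split; the rest stays in the inventory.
            remaining.append(lot._replace(units=lot.units - take))
        left -= take
    if left:
        raise ValueError("Not enough units to reduce: {} short".format(left))
    return matched, remaining


def check_fifo(lots, explicit_matches):
    """Return True if the explicitly matched lots are what FIFO would pick."""
    units = sum(lot.units for lot in explicit_matches)
    matched, _ = reduce_fifo(lots, units)
    return matched == sorted(explicit_matches, key=lambda lot: lot.date)
```

With this split, the automatic method only kicks in when the user's specification is ambiguous, and the check can be issued as an error on accounts declared as FIFO.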

Old Notes

  • Make a temporary hack to disable strict checks on a per-account basis. This will keep us going against average cost until the full inventory proposal is implemented.

    “The inventory booking proposal for average booking won’t be implemented in the next few weeks… I’m tempted to think that maybe I should provide a way to disable the strict balance checks in the interim. This way we could enter the transactions without matching lots strictly… at least all the data would be present and the balance checks would work. Would people think it’s a good idea? I would do this by extending the default value for the type of booking that is intended to take place (as in the inventory proposal) and adding a new value for it, i.e., in addition to STRICT, FIFO, LIFO, AVERAGE, AVERAGE_ONLY, I would add NONE. I would use the proposed syntax extension for the Open directive, e.g.

    2014-08-84 open Assets:US:Vanguard:VIIPX VIIPX “NONE”

    This also means that you could setup all your accounts to remove all inventory booking, which results in a booking method similar to Ledger (no checks and no errors), by setting the default value for it, like this:

    option “booking_method” “NONE”

    This could appeal to those who would like less checks, like Ledger, or who are converting their Ledger ledgers to Beancount.

    Down the road, the inventory booking would use that and implement all methods, but for now, only the balance check would consult that value and disable the check if the default booking method is “NONE”. I think I could easily hack that in a few hours.”

  • Implement the proposal
  • (design) New inventory booking:
    1. for each posting, classify by currency
    2. for each posting at cost, classify whether position augmentation or reduction
    3. For position reductions, match against inventory
    4. Within currency groups, process interpolation, including those in position augmentations

    It should be possible to do something like this for cost basis adjustments:

      Assets:Account  -10 MSFT {34 USD}
      Assets:Account   10 MSFT {USD}
      Income:PnL      400 USD

    (See doc on Smarter Elision for a better version of this)

  • Separate inventory booking to be implemented in a plugin. It should do three things:
    • Find matching lots and raise errors when not found
    • replace all partially specified lots to their fully specified versions (they matched lots). For augmenting lots, this means insert the date. For lot reductions, it means, find the matching lot and use that instead of the partially specified one.
    • Insert links on matching lots, so that trades can be identified a posteriori.

    This means, move beancount.ops.validation.validate_inventory_booking() to its own file and make it do the three steps above.

(avg cost idea)

  • Docs for inventory booking: Add {* 634.23 USD} idea for average cost booking: there should be an optional amount, and the star just means “before and after”. Add this to B docs.
  • PROBLEM: You need to be able to provide the cost with both an addition and a reduction, e.g.

      -2 HOOL {* 650 USD} ;; Should be possible even if current avg cost is 600 USD
      2 HOOL {* 650 USD} ;; Means “add at this cost and then convert to avg cost”

    This is nice! The “*” now always means “after applying this operation, convert to avg cost.”.

  • You should add tests to ensure that an Inventory() can never have positions created with a cost of the same cost_currency as the currency. This should be enforced in the Position object itself.
  • After it’s done, merge back branch ‘sanscost’, and we should be able to make this work using the total cost value on the lots.
  • Implement a report of Trades booked in the list of filtered transactions! Trades should be automatically identified by the booking process, with its own namespace of links. Then allow producing suitable reports for trades.
  • (reports) Bring back the trades report into the mainline version, using inventory reductions.
  • (reports) We really do need to report on position reduction as TRADES. This is an important report to generate! This should be done separately from the improved inventory booking method.

    This report needs to include the long-term vs. short-term nature of those trades! The right way to do this is to run a separate plug-in that will add appropriate #long-term and #short-term tags or meta-data to those transactions, based on their booking dates.

  • Add the acquisition date to each lot, so that short/long-term can be calculated for the lot. The goal is to enable the automatic calculation and reporting of long vs. short capital gains.
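
Once each lot carries its acquisition date, the classification itself is small. A sketch, assuming the rule is "held for more than one year means long-term" (that threshold and the Feb 29 handling are my assumptions, check them against the tax rules that apply to you), with holding_term() a hypothetical helper that a tagging plug-in could call:

```python
import datetime


def holding_term(acquisition_date, sale_date):
    """Return 'long-term' if held for more than one year, else 'short-term'."""
    try:
        anniversary = acquisition_date.replace(year=acquisition_date.year + 1)
    except ValueError:
        # Feb 29 has no anniversary in the following year; fall back to
        # Feb 28 (an assumption made for this sketch).
        anniversary = acquisition_date.replace(year=acquisition_date.year + 1, day=28)
    return "long-term" if sale_date > anniversary else "short-term"
```

The returned string maps directly onto the #long-term and #short-term tags mentioned above.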

Notes Copied from Full Booking Code

  • Testing:
    • FIXME: Come up with cases where we’re able to infer an AUGMENTING leg
    • FIXME: Come up with a case that would be ambiguous if not for the fact that one of the currencies already balances.

"""
2010-05-28 *
  Assets:Account1  100.00 CAD
  Assets:Account2  -80.00 CAD
  Assets:Account3  -20.00 CAD
  Assets:Account4   20.00 USD
  Assets:Account4 -100.00 CAD @
"""

"""
2010-05-28 *
  Assets:Account1  100.00 USD @ 1.2
  Assets:Account2  120.00 CAD
"""

  • TODO: Conversion from CostSpec to Cost
  • Make interpolation work off of Cost instances, not just CostSpec.
  • Notes:

    varieties:

    1. No cost, no price, with currency, e.g.

         Assets:Something 213.45 USD

       or

         Assets:Something USD

       This is obvious, it buckets into the units currency, i.e., USD.
    2. No cost, no price, no currency, e.g.

         Assets:Something

       This is an auto-posting. One of these should be replicated for every currency present in the transaction.

Postings with a price define their currency:

  1. No cost with price:

       Assets:Something 1000 JPY @ 120.0000 USD

     or

       Assets:Something 1000 JPY @ USD

     We use the price currency, e.g. USD
  2. No cost and no price currency:

       Assets:Something 1000 JPY @

     In this case, we must consult the other postings. We look

Then, we have postings with costs, which also come in two varieties:

  1. With an explicit cost currency, e.g.

       Assets:Something 100 HOOL {12.23 USD}

     Or with missing amounts, e.g.,

       Assets:Something 100 HOOL {USD}

     This clearly goes into the USD bucket.
  2. With no explicit cost currency, e.g.,

       Assets:Something 100 HOOL {2014-09-30}
       Assets:Something 100 HOOL {"1b24b1151261"}
       Assets:Something 100 HOOL {}

     These are uncategorized.

    In order to resolve these postings to a specific currency bucket, we implement two heuristics:

    a) If all the other legs are of a single currency and there are no other uncategorized legs, this posting must also book against those; use that currency.

    b) Otherwise, look at the accumulated ante-inventory; if there is a single currency for it, the posting must be in that currency.

    Finally, if we aren’t able to resolve the currency of that posting using (a) or (b), fail interpolation/booking and skip the transaction.

With that algorithm, we should be able to automatically resolve stock splits that look like this, as long as the ante-inventory contains only lots in USD:

2015-09-30 * "Split"
  Assets:Investments:AAPL -40 AAPL {}
  Assets:Investments:AAPL  80 AAPL {}

Finally, note that postings with both a cost and a price must have a currency that matches, as constrained by the parser. If only the price or the cost is specified, we use that currency. Both a price and a cost may not be missing–that would leave two DOF to fill in.
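
The bucketing rules above can be sketched as follows. The Posting tuple and both functions are simplified, hypothetical stand-ins (not the booking_full code), the ante-inventory is reduced to a set of its cost currencies, and auto-postings are left out of the sketch:

```python
from collections import namedtuple

# Hypothetical, simplified posting: only the fields the heuristics look at.
Posting = namedtuple(
    "Posting",
    "account units_currency has_cost cost_currency has_price price_currency",
    defaults=(None, False, None, False, None))


def explicit_currency(posting):
    """Return the bucket currency stated by the posting itself, or None."""
    if posting.has_cost or posting.has_price:
        # The parser constrains cost and price currencies to match, so
        # whichever one is present decides the bucket.
        return posting.cost_currency or posting.price_currency
    return posting.units_currency


def categorize(postings, ante_currencies):
    """Group postings into currency buckets, applying heuristics (a) and (b)."""
    buckets = {}
    uncategorized = []
    for posting in postings:
        currency = explicit_currency(posting)
        if currency is None:
            uncategorized.append(posting)
        else:
            buckets.setdefault(currency, []).append(posting)

    for posting in uncategorized:
        if len(buckets) == 1 and len(uncategorized) == 1:
            # (a) All other legs share one currency and this is the only
            # uncategorized leg: it must book against that currency.
            currency = next(iter(buckets))
        elif len(ante_currencies) == 1:
            # (b) The ante-inventory holds a single currency: use it.
            currency = next(iter(ante_currencies))
        else:
            raise ValueError("Cannot resolve currency for {}".format(posting.account))
        buckets.setdefault(currency, []).append(posting)
    return buckets
```

Run on the stock split above with an ante-inventory holding only USD lots, both legs land in the USD bucket through heuristic (b).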

More Old Notes

  • IMPORTANT FEATURE: Implement Average Booking for 401k adjustments, with associated tests and syntax in the parser.

    Update for inventory.py:

    def average(self):
        """Merge all lots of the same currency together at their average cost.

        Returns:
          A new instance of Inventory, with all the positions of each currency
          merged together at cost, bringing all these positions at average cost.
        """
        logging.warning('FIXME: continue here, this will be needed to report positions')

        # Accumulate units and total cost per (currency, cost currency) pair.
        units_map = defaultdict(Decimal)
        costs_map = defaultdict(Decimal)
        for position in self.positions:
            lot = position.lot

            cost_currency = lot.cost.currency if lot.cost else None
            key = (lot.currency, cost_currency)
            units_map[key] += position.number
            costs_map[key] += position.get_cost().number

        # Build a new inventory with one merged position per key.
        inventory = Inventory()
        for lotcost_currencies, units in units_map.items():
            lot_currency, cost_currency = lotcost_currencies
            cost_number = costs_map[lotcost_currencies]
            inventory.add(Amount(units, lot_currency),
                          Amount(cost_number, cost_currency),
                          allow_negative=True)

        return inventory

  • Change booking to always have lot-date and list trades automatically.
  • Report trades, all bookings should be findable after the fact using a link. This can be added without doing full booking.
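
For playing with the arithmetic outside of the Inventory class, here is a standalone sketch of the same merge on plain tuples. average_lots() and its tuple layout are made up for illustration; unlike the method above, it divides the accumulated total cost by the units so that the result carries a per-unit average cost:

```python
from collections import defaultdict
from decimal import Decimal


def average_lots(lots):
    """Merge (units, currency, cost_per_unit, cost_currency) lots at average cost."""
    units_map = defaultdict(Decimal)
    costs_map = defaultdict(Decimal)
    for units, currency, cost, cost_currency in lots:
        key = (currency, cost_currency)
        units_map[key] += units
        costs_map[key] += units * cost  # Accumulate the total cost of the lot.

    # One merged lot per key; keys whose units net to zero are dropped.
    return [(units, currency, costs_map[(currency, cost_currency)] / units, cost_currency)
            for (currency, cost_currency), units in units_map.items()
            if units]
```

For instance, 10 HOOL at 600 USD merged with 10 HOOL at 650 USD gives a single lot of 20 HOOL at 625 USD.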

Link Trades

  • Create a plugin that will link together all reducing transactions automatically. When a transaction reduces a position, both transactions should have a common unique link. This should likely be done by default, by the new booking method, because it already requires for the matching to happen in order to book lots properly. This is an important new type of report to support!

Short Sales

Test short sales in-depth. (They work now, but I want more test cases.)

  • Allow short sales eventually. This should already work if all that you do is selectively suppress the validation check that verifies that a position at cost may not go negative. We could selectively suppress it by adding a flag to the open directive associated with an account, or maybe adding some special syntax in the cost specification that allows us to do this.
  • You can implement the sign check for positions held-at-cost only when there are other units of that same commodity held at cost in the inventory in the opposite sign. This should allow holding short positions yet still retain the benefit of the check for data entry errors. It also removes what for most people will appear as a limitation from the docs (although with experience you would realize that it is not much of a limitation at all).
  • Add tests for holding short positions.
  • Add tests for holding long and short positions in different commodities.
  • In order to relax the constraint that you may not add negative units at cost, we could only disallow under certain circumstances:
    • An account has received units in the opposite direction
    • If the posting cross the zero boundary. Maybe starting from zero in either direction could be fine.
  • Idea: Relax checks for negative values: from docs

    “PLEASE NOTE! In a future version of Beancount, we will relax this constraint somewhat. We will allow an account to hold a negative number of units of a commodity if and only if there are no other units of that commodity held in the account. Either that, or we will allow you to mark an account as having no such constraints at all.”
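
The relaxed sign check could be as small as this. check_sign() is a hypothetical helper, and representing the other lots as a plain list of unit counts is a simplification of the real Inventory:

```python
from decimal import Decimal


def check_sign(other_lot_units, resulting_units):
    """Return True if a lot may end up holding resulting_units.

    other_lot_units: units of the other lots of the same commodity held at
    cost in the account. The lot is rejected only when it would sit on the
    opposite side of zero from one of them.
    """
    return not any(units * resulting_units < 0 for units in other_lot_units)
```

Opening a short position in an empty account passes, adding to an existing short position passes, and going negative while long lots of the same commodity are still held fails, which is the data entry error the check is meant to catch.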

Cleanup, Deprecation, and Performance

These are a number of pending tasks to tidy up the codebase in various ways. Some of these are more impactful than others.

Core Cleanup

  • IMPORTANT: We need to test the return values of the Inventory.add_amount() method! Review and do this now.
  • Inventory: Implement a test for Inventory.get_amounts() with multiple lots of the same currency; they really should have been aggregated.
  • (Annoyance) All functions that return an (entries, errors, options_map) triple really should return (entries, options_map, errors), in this order.
  • Define a C extension module to implement D(). This function should catch useless errors from the cdecimal library: Declined: [<class ‘decimal.ConversionSyntax’>] and always provide the input string so we can debug WTF happened.
  • Reconcile all usage of account regexps to beancount.core.account.ACCOUNT_RE throughout the code (grep --include='*.py' -srn A-Z ~/p/beancount/src/python/beancount).
  • An easy way to remove the diff_amount exceptional field from Balance is to move it to metadata, and this would be consistent with the goal of plugins using metadata for their own goals: assertions of Balance can be seen as a feature of the plugins.
  • Rename test_util.TestCase.assertLines() to assertEqualNoWS().
  • Review all the source code to use data.filter_txns() everywhere we can. The data.filter_txns() function has been created but the code hasn’t been converted.
  • Amount and Inventory and other basic classes: You could eventually support an implementation of __format__ which attempts to make sense of the different components, e.g., apply the format specifier to the number excluding the space required to render the currency.
  • (open directives) An invariant that we would love to have is to ensure that after parsing, all accounts that are used in a list of entries should have a corresponding Open directive for them. This would mean a variant of the validation routine that automatically inserts missing directives. At the moment, when an Open directive is missing, processing code that assumes they are always present might fail. We cannot insert the missing directives in the validation code simply because validation code is not allowed to modify the list of entries. We could insert a “fixup” step after validation, that does these kinds of automatic recoveries. Ponder this for a while.
  • (open directives) Do we need to insert Open entries for the equity accounts described in options? I think we could safely plop that at the very beginning of the entries list in the parser.
  • Document args of C functions in the same way as Python’s, perhaps using the new Python3 syntax definition thingamajig (I forget the name, there’s a PEP).
  • entries_table() really should be called postings_table().
  • Review all the code that is an effective switch/case on directive types and add checks for unknown directives. Make sure Commodity is being handled correctly. Grep for isinstance. Add else clause everywhere none was, e.g. https://bitbucket.org/blais/beancount/src/0e3be569f32a80411df8396d42d5e5ac3487a68f/src/python/beancount/core/realization.py?at=default#cl-292
  • Make all plugins accept a configuration parameter, unconditionally. The interface should be this regular. Right now, some of the plugin functions accept configuration, some don’t. This is easy and will make things more consistent.
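
For the D() item above, the error-reporting behaviour is easy to prototype in pure Python before committing to a C extension. This is only a sketch of the intent, not the current implementation; the comma stripping and the ValueError type are my assumptions:

```python
from decimal import Decimal, InvalidOperation


def D(text=None):
    """Convert a string to a Decimal, reporting the offending input on failure."""
    if text is None or text == "":
        return Decimal()
    try:
        if isinstance(text, str):
            # Tolerate thousands separators, e.g. "1,234.50".
            return Decimal(text.replace(",", ""))
        return Decimal(text)
    except InvalidOperation as exc:
        # Always include the input so we can see what went wrong.
        raise ValueError("Invalid number for D(): {!r}".format(text)) from exc
```

The point is only that the exception message carries the input string instead of a bare decimal.ConversionSyntax.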

Realization Cleanup

  • You need to convert some of TestRealization to TestCheck.

Install Cleanup

  • Don’t install all the _test.py files, make sure they’re not installed.

More Testing

  • In the parser or in the validation, check that the price currency matches that of the cost currency, if both are specified!

    2011-01-25 * "Transfer of Assets, 3467.90 USD"
      Assets:Investments:RothIRA:Vanguard:VTIVX 250.752 VTIVX {18.35 USD} @ 13.83 CAD
      Assets:Investments:RothIRA:DodgeCox:DODGX -30.892 DODGX {148.93 USD} @ 112.26 USD

  • Continue adding more of the pylint tests and make them pass. The codebase is almost at a point where it has all the tests from the default configuration.
  • Add a lint check that ensures the non-test files are never importing any of other test files.
  • You need to unit-test for multiline notes… do they work as expected?
  • Try to run the tests using ‘watchr’, ‘sniffer’, ‘autonose’ or other such tool.
  • Create special make target to run tests on my own large Ledger. This should bean-check, bean-roundtrip, bean-bake / scrape.
  • Implement a “fuzzing” input generator, that will output a very large input file with all possible kinds of combinations, to see where Beancount hits its limits and perhaps bring up some bugs from input I haven’t thought of. This is easy and fruitful.
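The price/cost currency check from the first item of this list might be sketched as follows. `Amount`, `Posting` and `check_price_cost_currencies` are hypothetical names used only for illustration; they are not the parser's or the validator's actual data structures.

```python
from collections import namedtuple
from decimal import Decimal

# Hypothetical, simplified posting representation.
Amount = namedtuple("Amount", "number currency")
Posting = namedtuple("Posting", "account units cost price")


def check_price_cost_currencies(postings):
    """Return one error message per posting whose price and cost currencies differ."""
    errors = []
    for posting in postings:
        if posting.cost is None or posting.price is None:
            continue  # Nothing to compare unless both are specified.
        if posting.cost.currency != posting.price.currency:
            errors.append(
                "{}: price currency {} does not match cost currency {}".format(
                    posting.account, posting.price.currency, posting.cost.currency))
    return errors
```

Applied to the transaction above, only the VTIVX posting would be reported, since its cost is in USD and its price is in CAD.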

Total Price Syntax Work

  • Document the @@ and {{}} syntaxes (see Matthew Harris email), especially as they relate to price.
  • You need to create a unit test for @@ price conversions.
  • As an aside, I see that the grammar supports @@ and {{}} syntaxes, but they don’t appear to be documented in the language manual.

    Oh I hadn’t noticed… I will document those, thank you for reporting this oversight.

  • When using @@ the signs should match; warn if they don’t.
  • Make prices required to always be positive, including with @@, and err on negative amounts for prices. This will match Ledger semantics and will remove one degree of freedom that wasn’t necessary.

Parser Cleanup & Improvements

  • You need to validate the account name options (empty, or no :, use regex to constrain).
  • Make booking the cost on the same currency as the instrument impossible:

    <account> 1212.023 USD {100.00 USD}

    Ditto w/ the price. This should only be done after allowing zero cost.

  • Add support for triple-quoted strings, which may alleviate problems we’ve seen with syntax highlighting (idea from aumayr@).
  • What happens if you specify the same plugin twice, with different configuration strings?
  • KeyboardInterrupts (if you press it) in the parser will cause errors:

    $ bean-report $L exportpf > /tmp/export.ofx
    blais.beancount:42041: KeyboardInterrupt:

    This is exceedingly rare but maybe worthwhile to catch regardless.

  • It would be very useful to have #ifdef syntax support. See if this could be integrated with cpp by preprocessing the input. This would also take care of #include, potentially, and I might be able to remove that feature from Beancount itself. Just an idea.
  • (parser) Is it possible to specify no flag on a transaction?, e.g. just the date?

    2014-07-12 ..

    Does this work? It would be nice if it did. Make it so. This is more permissive and would ease the burden of conversion for users already familiar with Ledger. We should change the grammar so that the flag is part of the txn_fields. This is elegant: basically, instead of the flag taking the place of the transaction, the ‘txn’ keyword just becomes optional. That’s it. DO THIS!

  • Add an option to the parser to not just ignore unparsed lines, more strict.
  • BUG: A transaction like this fails to parse; allow it: 2014-02-22 * “Payee” | ..
  • BUG: This input file without currencies does not currently fail! This is due to the parser accepting incomplete input. It should fail!

    2014-01-01 * “Buy Hooli”
      Assets:US:Broker:HOOL 120.00 {1212.5100 USD}
      Assets:US:Broker:Cash

    2014-04-08 * “HOOLI INC CL C SPINOFF ON”
      Assets:US:Broker:HOOLA -120.00 {1212.5100 USD}
      Assets:US:Broker:HOOLA 120.00 {1212.5100 * 0.4992 USD}
      Assets:US:Broker:HOOL 120.00 {1212.5100 * 0.5008 USD}

  • When an error occurs while parsing a directive/transaction, add the ability to let the parser skip until the next directive and ignore the parsed transaction because of the error. Maybe this should be an exception mechanism, or just storing a flag that gets reset when the directive is completed. Not sure. This would be a more elegant way to deal with some errors.
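The skip-to-next-directive recovery described in the last item can be illustrated on a toy, line-based reader. This is only a sketch of the idea: the real parser is grammar-based, and `split_directives`, `parse_with_recovery` and `toy_parse_block` are hypothetical helpers.

```python
import re

# Assumption: a directive starts on a line that begins with a date.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}\s")


def split_directives(lines):
    """Yield blocks of non-blank lines, each starting at a dated line."""
    block = []
    for line in lines:
        if DATE_RE.match(line) and block:
            yield block
            block = []
        if line.strip():
            block.append(line)
    if block:
        yield block


def parse_with_recovery(lines, parse_block):
    """Parse block by block; a failing block is dropped and reported."""
    entries, errors = [], []
    for block in split_directives(lines):
        try:
            entries.append(parse_block(block))
        except ValueError as exc:
            # Ignore the whole directive and resume at the next one.
            errors.append((block[0].strip(), str(exc)))
    return entries, errors


def toy_parse_block(block):
    """Hypothetical block parser: rejects a posting with a number but no currency."""
    for line in block[1:]:
        if len(line.split()) == 2:
            raise ValueError("missing currency: " + line.strip())
    return block[0].strip()
```

Here the exception plays the role of the "flag that gets reset when the directive is completed": one bad transaction produces one error, and the directives after it still parse.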

File Cleanup

  • Remove plugins from ops, remove algorithms from core:

    (Also remove “documents” option and move that as input to this plugin.)

    b.ops.documents -> b.plugins.documents (not sure if breaks)
    b.core.getters -> b.ops.getters (does it add deps?)
    b.core.realization -> b.ops.realization (does it add deps?)

Prices Cleanup

  • Make the code finding lists of commodities for holdings and price reports use the same code as beancount.ops.find_prices. Also, this should really just be renamed beancount.prices.find_commodities and perhaps be moved up to beancount.ops.

    src/python/beancount/ops/holdings.py
    src/python/beancount/reports/price_reports.py

  • Remove the final traces of the “quote” metadata field from holdings:

    /home/blais/p/beancount/src/python/beancount/ops/holdings.py:153: Commodity directive from its ‘quote’ metadata field.
    /home/blais/p/beancount/src/python/beancount/ops/holdings.py:190: quote_currency = commodity_entry.meta.get(‘quote’, None)

    Nothing else uses this anymore, this needs to go too.

  • Issue warnings when fetching prices with dates that are too far from the requested dates. We need to find a way to issue a global tolerance for this, that indicates to the user to fill in missing prices that are required to carry out particular reporting tasks.
  • Price entries should have extra metadata to disambiguate between implicitly created prices, linking to the original transaction that created them, and explicitly created ones.

Example Files Cleanup

  • Convert example file to use beancount.plugins.ira_contribs plugin for mirror accounting.

Revise Dependency Graph

As I’m moving to a system with more plugins and less code in the core, and with the intermediate reports stage instead of just the web interface, it’s becoming clearer where some files need to move.

  • Make various attempts to simplify depgraph, we want to ship with a really lean dependency graph.
  • ops & plugins should not depend on parser…
    • Move beancount.parser.options to beancount.core.options
    • Move beancount.parser.printer outside parser, ideally, or just factor the dependencies separately.
  • If you want to be consistent with the script names, rename beancount.reports to beancount.report. This way, bean-* matches a single package name. Just saying.
  • Emerge a principle for where the following files should separate, or merge the two modules:

    beancount.ops.* beancount.plugins.*

  • beancount.core.realization: Look at deps for beancount.core.realization and move it upstream where it makes sense, maybe ops.
  • beancount.core.getters: Should this move to ops as well? Check the dependency tree, see if it makes sense.

Improve Balance Checks

  • An important option exists around the behavior of balance checks: Should balance checks bring their balance to the balance amount? In other words, should a balance check imply an automatic, on-demand Pad behavior on failure? This has an impact on following balance checks further in time. The current behavior is not to pad a failure, so if there are multiple balance checks, a single mistake before them would generate a number of errors. An equally valid behavior would be to trigger only a single error, isolating the periods between balance checks. Both semantics have advantages. You should provide an option to customize this behavior so that the user can choose, e.g., an option named “pad_failing_balances”.
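The difference between the two semantics can be seen on a single-currency toy model. This is a sketch under simplifying assumptions: `check_balances` is a hypothetical function, events are plain tuples, and `pad_failing_balances` is the proposed option, not an existing one.

```python
from decimal import Decimal


def check_balances(events, pad_failing_balances=False):
    """Check ("txn", number) and ("balance", number) events given in date order.

    Returns a list of (index, expected, actual) tuples for failing checks.
    """
    running = Decimal(0)
    errors = []
    for index, (kind, number) in enumerate(events):
        if kind == "txn":
            running += number
        elif kind == "balance" and running != number:
            errors.append((index, number, running))
            if pad_failing_balances:
                # Bring the balance to the asserted amount, as a Pad would.
                running = number
    return errors
```

With a single mistake before two balance checks, the current behavior reports both checks as failing, while the padded variant reports only the first one.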

Tagged Strings

  • Make tags and payees output “tagged strings”, with their own data types. You can derive from str. These new objects should behave exactly like ‘str’ but carry over their type. This would be in force for tags, account names, and currencies.

    You need to write some careful testing to figure out what happens on concatenation, etc. with these types.

  • If you do this, you can make the Custom directive not output pairs of objects for its ‘values’ attribute; instead, you’d use the data type of the str object itself.
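A quick experiment shows what to expect from deriving from str, including the concatenation question raised above. `Account` and `Currency` are hypothetical class names for this sketch.

```python
class Account(str):
    """A string tagged as an account name."""
    __slots__ = ()


class Currency(str):
    """A string tagged as a currency."""
    __slots__ = ()


account = Account("Assets:Cash")

# The tagged string compares, hashes and indexes like a plain str.
balances = {"Assets:Cash": 100}
same_key = balances[account]

# The built-in str operations return plain str objects, so the tag is lost
# on concatenation and on method calls unless the result is wrapped again.
joined = account + ":Checking"
retagged = Account(joined)
```

So the type survives storage and comparison, but any code that builds new strings out of tagged ones has to re-apply the type explicitly.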

Improve Errors

  • Enhance error reporting! Make all errors possibly hold on to a list of entries, not just one. Many, many errors will benefit from this.
  • The creation of exceptions should be made easier: each error class should inherit from a base class that is able to accept an optional list of entries, that would automatically render the fileloc of each of those entries, and that would use the fileloc of the first entry in order to render the location of the error. If no entries are specified, an OPTIONAL fileloc= parameter should be provided to specify where the error occurs. This will make creating errors a lot easier and nicer.

    As part of this, we should also somehow produce a list of all possible errors with a lavish description.

  • Refine ‘source’ attribute on all directives: For .source, instead of ‘<…>’ for the filename, we should use a scheme:…, like file://…, and plugin:beancount.... . This makes a lot more sense. The lineno still needs to be separate, we need that for sorting and prefer not to have it part of the string.
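The base class described above might be shaped like this. It is a minimal sketch: `FileLoc`, `Entry` and `LedgerError` are hypothetical names, and the rendering format is only one possibility.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for a file location and a directive.
FileLoc = namedtuple("FileLoc", "filename lineno")
Entry = namedtuple("Entry", "fileloc description")


class LedgerError(Exception):
    """Error that accepts an optional list of entries or an explicit fileloc."""

    def __init__(self, message, entries=None, fileloc=None):
        super().__init__(message)
        self.message = message
        self.entries = list(entries or [])
        if self.entries:
            # The first entry determines where the error is reported.
            fileloc = self.entries[0].fileloc
        self.fileloc = fileloc

    def render(self):
        """Render the error location, the message, and each related entry."""
        if self.fileloc is None:
            where = "<unknown>"
        else:
            where = "{}:{}".format(self.fileloc.filename, self.fileloc.lineno)
        lines = ["{}: {}".format(where, self.message)]
        for entry in self.entries:
            lines.append("  {}:{}: {}".format(
                entry.fileloc.filename, entry.fileloc.lineno, entry.description))
        return "\n".join(lines)
```

Concrete error classes would then only need to subclass it and pass their message along with whichever entries are involved.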

Errors as Directives

  • (architecture) Seriously consider merging entries and errors; errors are just a special type of entry, and they have dates, and they get rendered in journals. This could make a lot of sense.
  • Interesting idea: Maybe instead of returning errors, “errors” could simply become “Error” directives and be inserted into the flow, and picked up by the various rendering routines in different ways?!? I love this. One less thing to return. Hmmm ponder it seriously.

Caching Improvements

Caching is emerging as a key feature because adding more plugins increases processing time. I’d like to eventually be able to cache as much processed data as possible, per file.

There are three types of relevant caching:

  • Load cache: Caches the parsing and processing of a top-level Beancount file to a pickle.
  • Price cache: Caches previously fetched prices from external price sources. This allows us to re-run price fetching cheaply.
  • [Not Done] Document conversion cache: For importing new documents, some conversions are very expensive, in particular conversions from PDF. When there are problems, we end up having to re-run these slow conversions many times. Even without problems, we run them at least twice: Once for extraction and once for filing. We should build a conversion cache.

These need to have consistent sets of related options and file locations.

  • Create a ~/.beancount-cache directory and store all the cache information in there instead of in the different places they are now.
  • I really need --no-cache and --clear-cache options across all the programs, consistently defined. When developing a plugin, it would be so much more convenient.
  • Add options to disable the cache in bean-check so that using -v would be useful when you’re trying to assess performance. I try this often enough I want options for it.
  • Use the same option on all tools for showing the timings, --verbose timings, maybe add it from the loader module.
  • (Performance) Great idea for performance: Implement the cache for each file separately. I could split my big file and only the bits which get edited would have to be recomputed.
  • All usage of environment variables should be removed by replacing them by the standard command-line options.
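A per-file load cache with a single cache directory and a switch to bypass it could be sketched as below. `cached_load`, its parameters, and the invalidation rule (file size plus modification time) are assumptions made for illustration, not the loader's current behavior.

```python
import hashlib
import os
import pickle


def cached_load(filename, load_fn, cache_dir, use_cache=True):
    """Return load_fn(filename), reusing a pickled result while the file is unchanged."""
    if not use_cache:
        # What a --no-cache option would map to.
        return load_fn(filename)
    stat = os.stat(filename)
    signature = (stat.st_size, stat.st_mtime_ns)
    key = hashlib.sha256(os.path.abspath(filename).encode("utf-8")).hexdigest()
    cache_path = os.path.join(cache_dir, key + ".pickle")
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as cache_file:
            cached_signature, result = pickle.load(cache_file)
        if cached_signature == signature:
            return result
    result = load_fn(filename)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "wb") as cache_file:
        pickle.dump((signature, result), cache_file)
    return result
```

Because the cache key is derived from each file's path, splitting a large ledger into several files would mean only the edited ones get reprocessed, and clearing the cache amounts to deleting the directory.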

Performance

  • (validation/performance) Optimize the performance of validations and bring all the HARDCORE_VALIDATIONS in by default.
  • Maybe the builder should have a ‘filename’ state that only gets changed here and there instead of getting that fileloc argument passed in every time on every rule. Maybe we just always get the fileloc from the parser.c as in NUMBER. I think it might make the parser more efficient too… try it out, do timings, see how much it improves parsing performance.
  • See how much faster BUILD() gets if you replace its PyObject_CallMethod call with PyObject_CallMethodObjArgs: “Note that if you only pass PyObject * args, PyObject_CallMethodObjArgs() is a faster alternative.” https://docs.python.org/3/c-api/object.html
  • (performance) Profile the web pages, if account_link() is high, provide an explicit cache for each unique view. (We had to remove this when we simplified the function using build_url for adding tests.)
  • (performance) Implement the stable hashing function in C and reinstall the validate_hash test.
  • (performance) Implement inventories in C and reinstall the validate_check_balances test.
  • (performance) Don’t pass in the FILE_LINE_ARGS on function calls, these should be part of the context of the parser, should be gettable only on demand.
  • Can I use Py_RETURN_NONE in order to incref and assign, in the lexer, instead of doing it in two steps?
  • Optimize the main update() routine that is called in display_context.

Parser Performance

  • Implement “D” in C, it’s worth it. This should make a substantial difference.
  • Test using the empty case of list parsing to create the initial empty lists instead of the conditional in Parser.handle_list() and measure, to see if there is a significant difference in parsing performance.
  • Parser performance: try not calling back to builder for simple types that just return their value; measure the difference, it may be worth it, and we wouldn’t lose much of flexibility, especially for the lexer types, which are aplenty.
  • Write the builder object in C… it won’t change very much anymore, and that’s probably simple enough.
  • Check the performance of D(). I suspect improving this routine could have a dramatic effect on performance.

Python Types & Namedtuple

Typed Data Types

  • Install proto3 and try to represent all the data using it; measure the difference in performance.

Hashing

  • See if we can remove Position’s __hash__ method now that it’s a namedtuple.
  • Also, look at all the objects in b.core.data, and see if you can override the hash function on them automatically in order to ignore the entry in postings, and the listness in entry.postings. It would be nice to be able to hash every directive type.
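One way to hash any namedtuple-based directive while ignoring the back-reference in postings and the list-versus-tuple distinction is a generic recursive walk. This is a sketch with hypothetical `Posting` and `Transaction` stand-ins; `stable_hash` is not an existing function, and the use of SHA-256 is an arbitrary choice.

```python
import hashlib
from collections import namedtuple

# Hypothetical, simplified directives; 'entry' is the posting's back-reference.
Posting = namedtuple("Posting", "entry account units")
Transaction = namedtuple("Transaction", "date narration postings")


def _feed(hasher, obj, ignore):
    """Feed a canonical representation of obj into the hasher."""
    if isinstance(obj, tuple) and hasattr(obj, "_fields"):
        hasher.update(("<" + type(obj).__name__).encode("utf-8"))
        for name, value in zip(obj._fields, obj):
            if name not in ignore:
                hasher.update((name + "=").encode("utf-8"))
                _feed(hasher, value, ignore)
        hasher.update(b">")
    elif isinstance(obj, (list, tuple)):
        # Lists and tuples are treated identically.
        hasher.update(b"[")
        for item in obj:
            _feed(hasher, item, ignore)
        hasher.update(b"]")
    else:
        hasher.update((repr(obj) + ";").encode("utf-8"))


def stable_hash(directive, ignore=("entry",)):
    """Return a hex digest that skips the ignored fields at any depth."""
    hasher = hashlib.sha256()
    _feed(hasher, directive, ignore)
    return hasher.hexdigest()
```

Since the walk relies only on `_fields`, it applies to every directive type without a per-class `__hash__` override.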

Portfolio Management

I’m somewhat automating the process of managing my own portfolio using Beancount. I export the contents of my assets to Google Finance using a script, and I have a way to upload my current list of holdings to a Google Spreadsheet, though I haven’t completely switched over to that. My plan is to ditch the GFinance export in favor of a custom spreadsheet, particularly because the latter can be entirely automated. My scripts will just update the list of assets in the spreadsheet and I should be able to derive various measures from that.

  • Complete the automation of uploading my portfolio to a spreadsheet and begin using that (over the Google Finance solution).
  • “Computing portfolio returns using Beancount.” Create a new document to describe the process, to describe how to do that.
  • Make sure that the list of holdings computed by bean-query is the same as that generated by the holdings report… Would it be possible to compute the latter using the former?
  • You should be able to export to input files or APIs for websites that track portfolios for you, such as Google Finance and Yahoo and others. Use the list of holdings as input. This should perhaps just be another report name.
  • Move the current portfolio code hacks out of my private repo and make that into the main repo.
  • Possibly create a ‘beancount.portfolio’ package and move all the code there, validated and all, if there’s enough meat for it.
  • In holdings: create the concept of a “composition” which can be associated to any holding, based on the (account, currency, cost-currency), and which is a vector of proportions to be normalized and associated to the holding. You should then be able to compute the sum total of all compositions. This can be achieved with metadata, and it could be a generalized concept, with the following applications:
    • Liquidity (how easy is it to get money out of this account?)
    • Taxability (pre-tax, roth, after-tax, usually 0 or 100%)
    • Sector, industry exposures
    • Currency exposure

    Honestly, it would be even better if we don’t have to even do that and if we can do this by querying metadata and aggregating on the postings directly.

  • Compute pre-tax and post-tax net worth reports, based on a “tax” account metadata field and some reasonable assumptions.
  • Fetch the CSV holdings of each Holding and compute the full list of stocks I own from these ETFs in dollar value. Sort by larger to smaller. Also compute the industry with that. You need to write Vanguard download (harder, need to scrape), and iShares download (easier, CSV).
  • Implement the dashboard on the example file, take the code out of my private code stash and share.
  • Replace the portfolio script by a bean-query command that will use metadata on the commodities (!). This will simplify things a lot and be more flexible. These are really just aggregations of a different kind.
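The "composition" idea above reduces to a value-weighted sum of normalized vectors. The sketch below assumes each holding has already been boiled down to a market value and a dictionary of proportions (for instance read from metadata); `aggregate_compositions` is a hypothetical helper.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_compositions(holdings):
    """Sum market values per category.

    holdings: iterable of (market_value, {category: weight}) pairs. Each
    weight vector is normalized so that its proportions sum to one.
    """
    totals = defaultdict(Decimal)
    for value, composition in holdings:
        norm = sum(composition.values())
        if not norm:
            continue  # Skip holdings without a usable composition.
        for category, weight in composition.items():
            totals[category] += value * weight / norm
    return dict(totals)
```

The same function would serve for liquidity, taxability, sector or currency exposure; only the vectors attached to the holdings change.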

Dashboard

Add the following to the portfolio dashboard:

  • PnL since yesterday, one week ago, two weeks ago, one month ago, three months ago
  • Current portfolio breakdowns
  • Cash report
  • A listing of short-term lots vs. long-term lots
  • Schedule of lots to become long-term in the near future
  • Returns (computed correctly, over many periods)

From email to fxt:

I want to build an investment dashboard, that would contain:

  • List of holdings, with various rollups (see the different aggregations of holdings reports)
  • Rollups of holdings against various types (e.g., Stocks vs. Bonds)
  • P/L since the morning or for the last day, over the last week, last two weeks, last month, last quarter, last year.
  • Report of uninvested cash and the detail of where it is
  • A listing of short-term vs. long-term lots, and a schedule of which lots are going to switch from short-term to long-term in the near future (to avoid selecting those for sale)
  • Returns, as computed by my prototype of our ideas during the bicycle trip
  • Automatically refresh current prices if run intra-day

This script would run every hour on a crontab and generate static HTML files that I could access from my phone to make investment decisions and monitor gains/losses during the day. I’m certain you would appreciate having such a thing too. I want to integrate the code I already have out of beancount/experiments and start moving all this stuff to the beancount.dashboard.* and add unit tests and make it work on the tutorial file. This should make it easy for others to use.
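The short-term versus long-term listing and the schedule of upcoming switches could be computed along these lines. This is a sketch: `classify_lots` is a hypothetical function, and both the 365-day holding period and the 30-day horizon are assumed parameters, not values stated in these notes.

```python
import datetime


def classify_lots(lots, today,
                  holding_period=datetime.timedelta(days=365),
                  horizon=datetime.timedelta(days=30)):
    """Split lots into long-term and short-term, and list upcoming switches.

    lots: iterable of (name, acquisition_date) pairs.
    Returns (long_term, short_term, upcoming), where upcoming is a sorted list
    of (switch_date, name) for short-term lots switching within the horizon.
    """
    long_term, short_term, upcoming = [], [], []
    for name, acquired in lots:
        switch_date = acquired + holding_period
        if switch_date <= today:
            long_term.append(name)
        else:
            short_term.append(name)
            if switch_date - today <= horizon:
                upcoming.append((switch_date, name))
    return long_term, short_term, sorted(upcoming)
```

The `upcoming` list is the part that matters when picking lots to sell, since those are the ones to avoid.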

Removal of Holding Object

At some point in Q4’2015 I overhauled the Posting object to rationalize its attributes. The result is a much simpler object, and one which could readily be used instead of the Holding objects I had defined in beancount.ops.holdings and beancount.reports.holdings_reports. I’d like to remove the Holding class to reduce complexity. This will make generating the reports for portfolios more straightforward: the list of assets simply should be the list of Postings accumulated at a particular point in time.

  • Remove Holding object everywhere and replace it by Posting. Convert all aggregation routines to work on Posting instances instead. {holding}
  • Replace Holding by the new, flattened Position, it’s a lot more like it. Makes a lot of sense now. Move the price_date to .meta, I don’t think it’s actually used.
  • In doing so, it might be useful to enrich the Price field of a Posting object with a date (number, currency, date). This would allow the creation of holdings at different dates. I’m not entirely certain we need it.
  • When attempting a conversion in holdings, if the rate isn’t available directly, you should always attempt to value it indirectly via an indirection through one of the operating currencies.
  • Add the acquisition date of each lot to each Holding, and it should be output at that date by print_holdings as well.
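The indirect valuation through an operating currency mentioned above can be sketched with a plain dictionary of rates. `get_rate` and the `(base, quote)` keyed dictionary are assumptions for illustration and stand in for a real price database lookup.

```python
from decimal import Decimal


def get_rate(prices, base, quote, operating_currencies=()):
    """Return the rate from base to quote, or None if it cannot be found.

    prices: dict mapping (base, quote) pairs to Decimal rates. If no direct
    rate exists, try going through each operating currency in turn.
    """
    if base == quote:
        return Decimal(1)
    direct = prices.get((base, quote))
    if direct is not None:
        return direct
    for via in operating_currencies:
        first = prices.get((base, via))
        second = prices.get((via, quote))
        if first is not None and second is not None:
            return first * second
    return None
```

For example, a HOOL holding priced only in USD can still be valued in CAD when a USD/CAD rate exists and USD is an operating currency.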

Old Notes

  • I think there’s a way to simplify holdings: you can probably remove the “Holding” type and replace that with a Posting, which really, is much like a Holding, it has an account and a position, and a price.... That would normalize Holding quite a bit, even if it means we end up adding a few unused slots to Posting. I’m happy to do that! Simplify simplify simplify… always.
  • Along with the new inventory, you can make Holding -> Position. This makes a lot of sense actually. Do do this!

(work on holdings)

  • Support output format “beancount” for holdings, use a single file instead of a holdings I/O file (merge holdings.csv + prices.beancount -> holdings.beancount) This would be much nicer.
  • Check holdings I/O by saving and reloading a list of holdings created from a set of entries (with sales, just to make sure).
  • In add_unrealized_gains(), convert to use our holdings aggregator.
  • Build a new category in the portfolio to identify accounts holding “Uninvested Cash”, which should be cash available to invest now.

Computing Asset Returns

Another important task I’ve been trying to do is to process my investment history in order to compute the actual, precise returns, including all fees and taking into account my particular cash transfers. Most brokers will provide either the generic returns of a particular instrument regardless of your exposure over time, or the difference between the current time and some past time for your entire portfolio. Neither of these is sufficient to evaluate your portfolio returns. I want to compute the returns of each instrument separately, entirely from the Beancount input file, and then combine those returns histories to compute portfolio returns.

In order to do this right, the structure of each investment has to follow some sort of pattern so that for each instrument we can identify external flows of money, internal flows, and “value accounts” which contain the commodities whose price fluctuates. I had a pretty good shot at this in the ‘returns’ branch, which has been merged, and have found some shortcomings, so I wrote the ‘returns2’ branch, but it is yet incomplete and I have to restart working on this at some point.

  • I need to add detailed debugging information in order to troubleshoot the cases where it does not work well. There are several corner cases and I’ve been having a difficult time identifying exactly why it fails for some commodities. Time to write some nice debugging output; this will be useful later, and for others to share when asking questions if they face similar difficulties.
  • In order to compute this properly, I will need to pull prices at various points in time. I want to automate the process of figuring out the dates at which prices are needed, i.e., when the prices in Beancount are too far from the evaluation dates, and perhaps automate the fetching of all necessary prices.
  • Bring in all the generic functions from experiments/returns/returns.py into core beancount. Bring in returns as a plugin.
  • Prices: Write a script to output the timeline of prices/rates missing & required in order to compute all the returns correctly. Then use it to drive fetching a historical table of monthly or perhaps weekly exchange rates for USD/CAD, USD/AUD, EUR/USD since the beginning of my file. Make this script reusable.
  • Fetch missing prices for my input file and recompute returns.
  • Returns calculation should spit out missing prices. Automate returns calculation.
  • Fix FIXME in beancount.projects.returns, from DClemente issues.

Net Worth

A complement to the portfolio management features is the ability to report on the total “net worth” on the balance sheet, over time, and as pre- and post- tax amounts.

  • Make the net-worth-over-time script output post-tax networth as well, using metadata. In other words, the post-tax amount should represent the liquidation value of the portfolio if all tax liabilities had to be incurred (e.g., taking all the money out of tax-deferred retirement accounts). This should use a percentage meta-data field in order to figure out the approximate taxation rate, e.g.,

    2014-01-29 open Assets:US:Vanguard:PreTax401k
      tax: “pre”

    Allowable values should be:

    “pre” : For regular tax-deferred accounts, where the entire amount is taxed on distribution.
    “after” : For after-tax 401k accounts where only the gains are taxed on distribution.
    “roth” : For Roth accounts (not taxable on distribution).
    None : For taxable accounts.

  • Make the net-worth script into an official project, don’t just leave it under experiments. Do this only after the removal of the Holding object, to minimize changes.
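The post-tax adjustment implied by the four allowable values might be computed as follows. This is a sketch under stated assumptions: `post_tax_value` is a hypothetical function, the rate is whatever the percentage metadata field provides, and accounts tagged None are left unadjusted here, which is a simplification.

```python
from decimal import Decimal


def post_tax_value(market_value, cost_basis, tax, rate):
    """Approximate the liquidation value of an account after taxes.

    tax: "pre", "after", "roth" or None, as read from the account's metadata.
    rate: approximate taxation rate as a Decimal fraction, e.g. Decimal("0.25").
    """
    if tax == "pre":
        # The entire amount is taxed on distribution.
        return market_value * (1 - rate)
    if tax == "after":
        # Only the gains are taxed on distribution.
        gains = max(market_value - cost_basis, Decimal(0))
        return market_value - gains * rate
    if tax in ("roth", None):
        # Assumption: no adjustment for Roth and for taxable accounts.
        return market_value
    raise ValueError("Unknown tax value: {!r}".format(tax))
```

Summing this over all asset accounts would give the post-tax net worth figure next to the pre-tax one.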

Standalone Tools

In order to produce many of the reports, I’ve had many ideas of composable command-line tools to build and provide for processing text with account names. These are small standalone projects.

  • Build a function and command-line tool that can ingest either a table of results or a CSV file and infer that an entire column is of numbers and pairs of numbers and can accordingly split the column into multiple columns and put the currency in the header so you can import that up to a spreadsheet.
  • Build a ‘statement’ tool that will render a treeified balance sheet in two columns! Limit it to use the beginning of a line, and hard-code to use the five known categories (optionally changeable). It’s okay if the tool is a bit more limited than treeify. It should optionally do the treeification. It should also optionally sort the account names (or not).
    • Add a --title option to render at the top.

    (A tool to convert one column into two columns, for text-mode balance sheets and income statements. The equivalent UNIX tool does not exist. Select columns by regexp on prefix.)

  • Build a simple ‘colrneg’ tool that just highlights numbers as green or red depending on if positive or negative.
  • Perhaps we should build a version of treeify for internal usage that works on HTML columns, off of HTML text. Or BETTER: just a stateful tool that can transform an account’s name to indent it properly every time you feed it the full account name! This could be used by the routines that want to render columns as a tree. Maybe ‘treeify’ should use that as well. That would make a lot of sense.
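The stateful variant suggested in the last item could be as small as this. `AccountIndenter` is a hypothetical class for this sketch; it assumes account names are fed in sorted order, as they would be when rendering a column.

```python
class AccountIndenter:
    """Turn successive full account names into indented tree lines."""

    def __init__(self, indent="  "):
        self.indent = indent
        self.previous = []

    def feed(self, account):
        """Return the lines to print for this account, given the previous one."""
        parts = account.split(":")
        # Count the leading components shared with the previous account.
        common = 0
        for old, new in zip(self.previous, parts):
            if old != new:
                break
            common += 1
        lines = [self.indent * depth + parts[depth]
                 for depth in range(common, len(parts))]
        self.previous = parts
        return lines
```

Only the components that differ from the previous account are emitted, so a renderer can call `feed()` once per row without knowing the full tree in advance.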

Streamline Commands

As features grow, so do commands. I’d like to minimize the number of them, where it makes sense. I find it’s often possible to do that.

  • I plan to remove bean-example and fold that into a bean-doctor subcommand.
  • Maybe bean-bake could be folded into bean-web.
  • bean-sql is a bit of an experiment, I’m not sure we need it, but I want to keep the functionality. Maybe this should only be a subcommand of the code that provides Beancount postings as a virtual SQLite3 table.

Debits & Credits Normalization

One simple idea that might make at least some people happy is to make it possible to input the great majority of numbers as positive numbers. This can easily be done by considering the account type to which a number is attached and invert the number, if necessary. For example, if this input mode is enabled, you would input a posting to Income as a positive number, but it would be interpreted as a negative number.

  • Allow sign normalization:
    • Add an option to the parser to allow signs to be entered with the “all positive” convention, and actually invert the signs right at the output of the parser. Balance errors should be enhanced to emphasize which of the postings should be increased or decreased, based on the sign of the balance error and the type of each account.
    • For display, in the shell, provide a SIGN(account) function that allows the user to multiply the inventory by, or a NORM(inventory, account) function that would do that itself on the inventory.

    This whole thing should be a minor version. This would be a valuable feature IMO, allowing users to choose their favorite convention would be a plus.

  • Do support rendering options to invert the amounts of the minus accounts. This is an important feature.
  • The new balance sheets should be able to invert the numbers (and then they should get rendered differently). Basically, every number shown should be either in signed or cr/dr format. We should be able to switch between the two at render time. This should work across all number-rendering routines everywhere–do this centrally.
  • In the balance sheet and income statement, we need to render the amounts inverted (and in a slightly different style).
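The sign convention above boils down to one multiplication per number, keyed on the account type. In this sketch, `account_sign` and `normalize` are hypothetical helpers (roughly what the proposed SIGN and NORM shell functions would do), and the root account names are assumed to be the default five.

```python
from decimal import Decimal

# Assumption: accounts under these roots normally carry positive balances.
POSITIVE_ROOTS = ("Assets", "Expenses")


def account_sign(account):
    """Return +1 for Assets/Expenses accounts, -1 for the other account types."""
    root = account.split(":")[0]
    return 1 if root in POSITIVE_ROOTS else -1


def normalize(number, account):
    """Convert between the signed and the "all positive" conventions.

    The conversion is its own inverse, so the same function can be applied
    at the output of the parser and again at render time.
    """
    return number * account_sign(account)
```

For instance, a salary entered as 100 on an Income account would be stored as -100, and rendering it through the same function shows 100 again.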

Sanity Check For Conversions

  • Insert a validation check when transferring amounts to the balance sheet that the implied rate of the conversion entries is within certain bounds of the price, for each pair of commodities (find a way). These bounds should be proportional to the variance of the price. This would just provide an extra amount of good fuzzy feeling, knowing for sure that my solution to the conversions problem is always meaningful and correct.
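A very rough version of this check compares the implied rate against the spread of known prices. Note the swap: this sketch uses a multiple of the standard deviation rather than the variance itself, and `implied_rate_is_plausible` and `num_stdev` are hypothetical names.

```python
import statistics


def implied_rate_is_plausible(implied_rate, prices, num_stdev=3.0):
    """Check that implied_rate lies within num_stdev deviations of the mean price.

    prices: list of float prices observed for the same pair of commodities.
    """
    mean = statistics.mean(prices)
    stdev = statistics.pstdev(prices)
    return abs(implied_rate - mean) <= num_stdev * stdev
```

A pair whose price has been constant has zero deviation, so only the exact mean passes; a real implementation would probably want a minimum tolerance as well.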

Add an “End Balance” Directive

Some people are requesting an end balance directive.

Improve the Include Directive

  • Support an include directive that is a URL, in order to fetch lists of prices updated remotely, or via crontab. This way the dashboard does not have to include code that fetches prices.
  • Idea: an include directive should have a “prefix” option, to add a prefix to all accounts from the included file.

Improvements to Price Fetching

  • Infer the rendering precision for inverted currencies from only the list of previous prices. Run a DisplayContext on those and do similar quantization/rounding automatically. This would make a lot of sense, especially when inversion generates a very large number of digits.
  • Write a script that will iterate and fetch all the missing dates on Friday, or last DOM for the required portfolio during the entire duration, or allow providing a frequency string (does rrule have one?) and let bean-price do this natively. This will work the API.
  • You have to deal with the case whereby two currencies provide incompatible conversions of each other. This would potentially result in colliding entries in the price database. Not sure how to deal with those yet.
  • Write a script that will automatically fetch the dates I held various positions at cost for throughout the history and a list of weekly dates to fetch rates for.
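Generating the candidate dates (every Friday, or the last day of each month) needs nothing beyond the standard library. `fridays` and `month_ends` are hypothetical helper names in this sketch; a recurrence-rule library would be an alternative way to express the same frequencies.

```python
import calendar
import datetime


def fridays(start, end):
    """Yield every Friday between start and end, inclusive."""
    # Monday is weekday 0, so Friday is weekday 4.
    current = start + datetime.timedelta(days=(4 - start.weekday()) % 7)
    while current <= end:
        yield current
        current += datetime.timedelta(days=7)


def month_ends(start, end):
    """Yield the last day of each month that falls between start and end."""
    year, month = start.year, start.month
    while True:
        last_day = calendar.monthrange(year, month)[1]
        current = datetime.date(year, month, last_day)
        if current > end:
            return
        yield current
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
```

Either generator could be intersected with the periods during which a position was held in order to produce the list of dates to fetch.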

OANDA

  • Finish the implementation of my delayed rate fetcher from OANDA.

Will & Legacy

One of the tasks I’d like Beancount to be able to do is to produce documents which list the account details of assets and liabilities, including account number, institution address, phone numbers, contact persons, etc. in order to encrypt and share with a trusted one, in case I pass away.

  • Create a new type of report that produces a readable document with the entire list of accounts and descriptions and account numbers pulled from metadata. This document should be attachable to a will, to describe all the accounts, institutions’ phone numbers, account numbers, in a way that makes it possible for someone executing a will to easily understand the full nature of the assets and how to reach the relevant institutions to liquidate the assets.
  • In order to have someone else be able to take care of your business, you should be able to produce a list of the accounts open at the end of the period, with the account ids and balances. This should be printable and storable, for someone else to take care of your things in case you die.

Gains Sans Commissions

How do I take into account the commissions and fees adjustment on the cost basis for a position? You need to be able to accommodate commissions as not being part of the gain.

BRANCH: ‘sanscost’

  • This work depends on the completion of the ‘booking’ branch.
  • Capital gains should not include commissions, neither on the buy side nor on the sell side. How do we book them like this? Can we count this somehow automatically? Misc accounts? Not sure.

Normalized Credit & Debits

One idea is to allow the user to work in terms of credits and debits, using all (largely) positive signs for all postings.

  • Make this possible in the input, switch the signs in the parser.
  • Also render the numbers with normalized, positive signs only.
  • Make it possible to render the credits and debit numbers separately, putting them in the correct column based on their account type AND sign.
  • Rendering only: Color the background of numbers with an inverted sign (e.g. payments in a liability account) differently! There should be modes to rendering balance sheets and income statements with inverted amounts, and it should all be done client-side. When amounts are rendered as credits/debits, color their background distinctly, so that it’s obvious what kind of sign convention is in use.
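As a rough illustration of the rendering rule above, the sketch below picks the debit or credit column from the sign of a posting and flags amounts whose sign is inverted relative to the account type. It assumes the usual Beancount convention (positive numbers are debits) and treats Liabilities, Income and Equity as credit-normal; the function name `render_posting` is hypothetical.

```python
from decimal import Decimal

# Assumption: root account types whose normal balance is a credit
# (negative in Beancount's signed representation).
CREDIT_NORMAL = {"Liabilities", "Income", "Equity"}


def render_posting(account, number):
    """Return (column, positive_number, inverted) for one posting.

    `column` is "debit" or "credit", `positive_number` is the amount
    rendered without a sign, and `inverted` is True when the posting goes
    against the normal sign of its account type (e.g. a payment posted to
    a liability account), which is the case to highlight with a color.
    """
    root = account.split(":")[0]
    column = "debit" if number >= 0 else "credit"
    normalized = -number if root in CREDIT_NORMAL else number
    return column, abs(number), normalized < 0
```

For example, `render_posting("Liabilities:CreditCard", Decimal("50"))` lands in the debit column and is flagged as inverted, since paying down a card goes against the account's normal sign.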

Total Balance

Currently, the Balance directive only asserts the currency of its attached amount. Some users have requested a balance check that is exhaustive, that is, which asserts that the inventory matches the contents of a directive.

  • Implement some parser-level, generic directive for specifying the contents of an inventory (units only).
  • Implement a total balance assertion using the following syntax:

    YYYY-MM-DD balance
      account amount
      account amount
      account amount
      account amount

    The distinction is that it’s on multiple lines. Maybe call it balance*.

  • A variant on this would be a balance directive which asserts the cost basis of an account, either by currency, or total. Perhaps this could be useful, and if anything, easy to implement. See past discussion with Eric Weigle on the ledger-cli mailing-list.
  • When you will add cost basis to the balance assertions, make the padding directive also able to fill in with some cost basis. This would be useful for mharris (see discussion on Language Syntax document).
  • Create a new directive for balance that checks for the complete balance. Ideas for syntax:

    2014-06-20 balance Assets:Some:Account 10 HOOL, 640.40 USD FULL
    2014-06-20 balance Assets:Some:Account [10 HOOL, 640.40 USD]
    2014-06-20 balance Assets:Some:Account <10 HOOL, 640.40 USD>
    2014-06-20 balance* Assets:Some:Account 10 HOOL, 640.40 USD
    2014-06-20 full_balance Assets:Some:Account 10 HOOL, 640.40 USD

    Maybe we should define a general syntax for input’ing an Inventory object, that could be read at parsing time.
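Whatever syntax is chosen, the exhaustive check itself is small. The sketch below compares units only, as suggested above, and models an inventory as a plain mapping from currency to number; this mapping and the name `check_full_balance` are simplifying assumptions, not Beancount's Inventory API.

```python
from decimal import Decimal


def check_full_balance(inventory, expected):
    """Return a list of error strings; the list is empty on an exact match.

    Both arguments map a currency to a number of units. Unlike the regular
    Balance directive, a currency present on only one side is an error.
    """
    errors = []
    for currency in sorted(set(inventory) | set(expected)):
        actual = inventory.get(currency, Decimal(0))
        wanted = expected.get(currency, Decimal(0))
        if actual != wanted:
            errors.append(f"{currency}: expected {wanted}, found {actual}")
    return errors
```

With this definition, an account holding 10 HOOL, 640.40 USD and a stray 5 CAD would fail an assertion of `10 HOOL, 640.40 USD`, whereas the current per-currency Balance directive would let the CAD go unnoticed.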

Wash Sales

Right now [2016-04-20], there is an experimental plugin that removes the commissions from the P/L (experiments/washsales/commissions.py). Its output is then used by a custom script (experiments/washsales/list-wash-sales.py) which uses metadata from the postings marked to be washed by the user; all the “washing” takes place somewhere in that script, not in Beancount. There is also another script (experiments/washsales/list-lots.py) which is used to list the resulting lots.

  • Document all cases of wash sales and support arbitrary washing of sales in an official Beancount plugin.
  • The washing of the lots needs to become a part of regular Beancount: it should occur within Beancount, not in the list-wash-sales script.

    For example, this transaction:

    2015-05-19 * "Sold some" ^c2fbec103d99
      ref: 036
      Assets:US:Broker:HOOL -X HOOL {538.6500 USD} @ 552.4460 USD
      Assets:US:Broker:HOOL -X HOOL {577.5400 USD} @ 552.4460 USD
        wash: TRUE
      Expenses:Financial:Commissions 0.19 USD
      Assets:US:Broker:Cash XXXX.XX USD
      Income:US:Broker:HOOL:PnL

    Should be converted to this one automatically:

    2015-05-19 * "Sold some" ^c2fbec103d99
      ref: 036
      Assets:US:Broker:HOOL -X HOOL {538.6500 USD} @ 552.4460 USD
      Assets:US:Broker:HOOL -X HOOL {577.5400 USD} @ 552.4460 USD
        wash: TRUE
      Expenses:Financial:Commissions 0.19 USD
      Assets:US:Broker:Cash XXXX.XX USD
      Income:US:Broker:HOOL:PnL
      Income:US:Broker:HOOL:PnL YYY.YY USD
      Income:US:Broker:HOOL:Adjustments -YYY.YY USD

    The list-wash-sales script ought to be a simple gathering of the already washed sales.

  • Listing the lots in a particular account at a particular date should be carried out not with a custom experiments/washsales/list-lots.py script, but rather by writing a simple SQL query. This requires the ‘booking’ branch to be completed, and the CSV output from bean-query to work as well. Do this.
  • Enumerate the various cases for wash sales, write a nice document for them, and write example cases for each, to be referenced from the document. Solve this completely, using metadata and a plugin.

I wrote a script to wash losses for entire lots. This still needs more work: ideally, one should be able to insert an arbitrary cost basis adjustment for each lot. Do this, and integrate with the wash-sales-tracker implemented by bbreslauer.

  • It should be possible to produce input for the wash sales calculation, to then insert metadata or some other set of postings on each lot, and to produce output suitable to be printed and sent to the IRS.
  • There is already a plugin which allows the user to wash away the P/L of entire lots under experiments/washsales. Enhance that to support arbitrary adjustments and move that into ‘beancount.plugins’.

Stock Splits

Beancount currently does not deal with stock splits, but you can deal with them yourself. Essentially, you have to write a single transaction which empties out your positions and recreates them at their new cost basis. This process could be automated; probably the best way to carry this out would be to write a plugin that provides a new custom directive.

  • Find a way to automatically create multiple currencies to account for stock splits. The user should not have to do this themselves:

    “One choice you have is whether you want to keep the same symbol pre and post split. If you keep the same symbol the price database will show a drop in price over time. If not, choose a new symbol for the post split period. So far in my own file I use the same symbol but I think eventually I’d like to find a way to differentiate them automatically and have a multiplier in the price database, because doing it manually is not nice. In any case both methods work.”

    Perhaps a directive could be created to declare them that would automatically insert converting transactions, but that makes it difficult to deal with weird cases, like oddly unequal splits (e.g. Google A/C), or cases with spinoffs, as below. Perhaps the splitting transaction should be inserted automatically, but that the split directive would just create a new currency name, internally.

    A currency would then be referred to by a pair of (name, date), and that would resolve to an internal currency name.

  • (IDEA) In order to create suitable stock split entries that would look like this:

    2013-04-01 * "split 4:1"
      Assets:CA:ITrade:AAPL -40 AAPL {{5483.09 USD}}
      Assets:CA:ITrade:AAPL 160 AAPL {{5483.09 USD}}

    You could easily add support for a directive that looks like this:

    2013-04-01 split Assets:CA:ITrade:AAPL 4:1 AAPL

    This would allow the user to do some processing specific to stock splits by processing the explicit stock split entries.

  • Include this in the user examples, + stock splits:

    2013-04-01 * "name change"
      Assets:CA:ITrade:AAPL -40 AAPL {{5483.09 USD}}
      Assets:CA:ITrade:NEWAAPL 40 NEWAAPL {{5483.09 USD}}

    2013-04-01 * "spinoff"
      Assets:CA:ITrade:KRFT -100 KRFT {{20000 USD}}
      Assets:CA:ITrade:KRFT 100 KRFT {{17000 USD}}
      Assets:CA:ITrade:FOO 20 FOO {{ 3000 USD}}

  • Because of the way we currently deal with stock splits, allow a list of commodity names on the commodity directive, so you can do this:

    1998-01-01 commodity CRA,CRA1
      name: "Celera Corporation"
      asset-class: "Stock"
      ticker: "CRA"
      quote: USD
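The expansion of the proposed `split` directive into the two postings of the “split 4:1” example above can be sketched as follows. Postings are modeled as plain tuples, and the function name `expand_split` is hypothetical; a real plugin would have to build Beancount's own Transaction and Posting objects and look up the held lots itself.

```python
from decimal import Decimal


def expand_split(account, currency, units, total_cost, cost_currency, ratio):
    """Expand a split such as "4:1" into two postings held at total cost.

    Each posting is (account, units, currency, total_cost, cost_currency).
    The first posting removes the old units, the second adds the new ones,
    and the total cost basis is carried over unchanged.
    """
    new, old = (Decimal(part) for part in ratio.split(":"))
    new_units = units * new / old
    return [
        (account, -units, currency, total_cost, cost_currency),
        (account, new_units, currency, total_cost, cost_currency),
    ]
```

Calling it with 40 AAPL at a total cost of 5483.09 USD and a ratio of "4:1" yields the -40 and 160 unit postings shown in the example. Spinoffs and unequal splits would not fit this shape and would still need a hand-written transaction.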

Sharing Expenses

Producing Journals as Tables

  • Figure out some way to report the names of the other accounts of a particular account. For example, render a journal as a detailed expense report, for a set of accounts (e.g., Expenses:*) pulling out amounts in various columns based on other expressions (e.g. Assets:Cash:Caroline).

    This would be an approximation:

    bean-query $L "
      SELECT date, account, maxwidth(description, 25), convert(position, 'USD')
      FROM has_account('Caroline')
      WHERE (NOT account ~ 'Caroline')
        AND (NOT account ~ 'Rounding')
        AND account ~ 'Expenses:';"

    However, it doesn’t work so well. The description isn’t good enough. A better approach would be to just provide a special column which lists all the other accounts in a single string.

Book Unrealized Gains Correctly

  • IMPORTANT: Unrealized gains for opened periods should show only gains since the openings. In other words, unrealized gains should be realized marked-to-market at the time of open.
  • Unrealized gain when rendering for closed years does not appear. Perhaps we should insert the unrealized gains during close operation.

    Idea: close realized gains along with close(), so that they don’t show up for the latest year.

  • Unrealized gains should be modified so that they replace the book value of the positions that they adjust, and can be applied at multiple dates. Then, the realization should automatically occur both at the beginning and end of reporting periods.

Old Notes

  • Unrealized gains should not be added if the gain is zero.
  • There’s a fundamental question about which date to be used for pricing entries. This really would depend on the view. If this is a period view, the date of the last entry is most appropriate. If it is any other kind of view, the latest price is best. All the reports should be adjusted for this.

Old Notes

  • Unrealized capital gains could be inserted automatically into special sub-accounts, based on the current price and the cost-basis of particular accounts. This could be inserted automatically! e.g.

    DATE check Assets:US:Ameritrade:AAPL 10 AAPL {200 USD}

    DATE price AAPL 210 USD

    Assets:US:Ameritrade:AAPL 2000 USD
    Assets:US:Ameritrade:AAPL:Gains 100 USD

    The “Gains” subaccount could be inserted automatically if the price differs from the cost basis… this would be a clever way to represent this! We could even do this by inserting a transaction automatically with an offsetting account… actually this would be the RIGHT way to do this!

    We need an option to designate which subaccount leaf to create all the new transactions for:

    %option account_unrealized “Unrealized”

    2013-05-23 'A "Booking unrealized gains for AAPL"
      Assets:US:Ameritrade:AAPL:Unrealized 230.45 USD
      Income:Unrealized -230.45 USD

    By doing this, the reporting does not have anything to do… it can choose to report positions at cost or in held units, and whether the gains are included or not entirely depends on whether these transactions have been inserted in or not.
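The arithmetic behind such an automatically inserted transaction is simple. The sketch below computes the gain for a single lot and builds the two offsetting postings; it uses plain tuples instead of Beancount's data types, and the function name and its default account names are assumptions made for illustration.

```python
from decimal import Decimal


def unrealized_entry(account, units, cost, price, currency,
                     subaccount="Unrealized", income="Income:Unrealized"):
    """Return the two postings booking an unrealized gain, or None.

    The gain is units * (price - cost). No entry is produced when the gain
    is zero, following the note above that zero gains should not be added.
    Each posting is an (account, number, currency) tuple.
    """
    gain = units * (price - cost)
    if gain == 0:
        return None
    return [
        (f"{account}:{subaccount}", gain, currency),
        (income, -gain, currency),
    ]
```

With the numbers from the example above (10 AAPL at a cost of 200 USD and a price of 210 USD), the gain is 100 USD, posted to the subaccount and offset against the income account.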

Pivot Reports

An interesting reporting idea which comes back again and again is the ability to produce a year-on-year or month-by-month summary of income and expenses, as a table. This is essentially a pivot table where the date (year or month) is rendered on the X axis and the accounts on the Y axis. Account balances are aggregated in the cells.

  • Implement PIVOT reports directly from the shell. These are oft-requested.
  • Including RSP contribs, like my big spreadsheet that I crafted manually? Can I do that? That would be awesome!
  • ‘csv-pivot’: build this: a script that can accept a CSV file and render a CSV pivot table from it. The reason we need support is in order to carry out operations on columns of inventories. Maybe we should implement some sort of Swiss-army-knife tool that is able to parse inventories from columns and perform various operations on them, aggregations, etc. using Beancount’s Inventory() class. This could be a powerful tool! Make it possible to parse and create Inventory objects from cells.
  • One kind of report that would be GREAT is a single grid with all income accounts on the left with year by year on the horizontal. An overview of all the years. Same with month-by-month report.
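The grid described above (accounts down the side, years across the top) can be sketched as follows. This assumes single-currency numbers rather than Inventory objects, and postings given as plain `(date, account, number)` tuples; the function name `pivot_by_year` is hypothetical.

```python
import datetime
from collections import defaultdict
from decimal import Decimal


def pivot_by_year(postings):
    """Aggregate (date, account, number) postings into a year-by-year grid.

    Returns (years, rows): `years` is the sorted list of column headers and
    each row is [account, total_for_first_year, total_for_second_year, ...],
    with a zero in cells where the account had no postings.
    """
    table = defaultdict(lambda: defaultdict(Decimal))
    for date, account, number in postings:
        table[account][date.year] += number
    years = sorted({year for row in table.values() for year in row})
    rows = [
        [account] + [table[account].get(year, Decimal(0)) for year in years]
        for account in sorted(table)
    ]
    return years, rows
```

A month-by-month variant would only need to key the cells on `(date.year, date.month)` instead of `date.year`.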

Use Metadata for Pivoting

  • One idea Ledger uses well is the ability to associate key-value metadata to transaction objects, a-la-Common Lisp. See the --pivot feature. It seems a bit superfluous at the moment, but may be useful in order to provide the ability to implement custom aggregations eventually, instead of using the strings. Maybe the payee could be a special case of this, e.g. payee="value"

    (From mailing-list):

    Take this example:

    2011-01-01 * Opening balance
      Assets:Cash                  25.00 GBP
      Equity:Opening balance      -25.00 GBP

    2011-02-01 * Sell to customer AAA
      ; Customer: AAA
      ; Invoice: 101
      Assets:Receivables           10.00 GBP
      Income:Sale                 -10.00 GBP

    2011-02-02 * Sell to customer BBB
      ; Customer: BBB
      ; Invoice: 102
      Assets:Receivables           11.00 GBP
      Income:Sale                 -11.00 GBP

    2011-02-03 * Sell to customer AAA
      ; Customer: AAA
      ; Invoice: 103
      Assets:Receivables           12.00 GBP
      Income:Sale                 -12.00 GBP

    2011-02-03 * Money received from customer AAA for invoice 101
      ; Customer: AAA
      ; Invoice: 101
      Assets:Cash                  10.00 GBP
      Assets:Receivables          -10.00 GBP

    Now you can see how much each customer owes you:

    ledger -f d bal assets:receivables --pivot Customer
      23.00 GBP  Customer
      12.00 GBP    AAA:Assets:Receivables
      11.00 GBP    BBB:Assets:Receivables

      23.00 GBP

    And you can see which invoices haven’t been paid yet:

    ledger -f d bal assets:receivables --pivot Invoice
      23.00 GBP  Invoice
      11.00 GBP    102:Assets:Receivables
      12.00 GBP    103:Assets:Receivables

      23.00 GBP
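The same aggregation is easy to express over metadata. The sketch below reproduces the totals of the two Ledger reports above from the receivables postings of the example, modeled as `(meta, account, number)` tuples; the function name `pivot_balance` is hypothetical and this is not how Ledger implements the feature.

```python
from collections import defaultdict
from decimal import Decimal


def pivot_balance(postings, account, key):
    """Sum the postings of `account` grouped by the metadata value of `key`.

    Groups whose total is zero (e.g. a fully paid invoice) are dropped,
    mirroring how a balance report omits empty accounts.
    """
    totals = defaultdict(Decimal)
    for meta, acct, number in postings:
        if acct == account and key in meta:
            totals[meta[key]] += number
    return {value: total for value, total in totals.items() if total != 0}


# The Assets:Receivables postings from the example above.
RECEIVABLES = [
    ({"Customer": "AAA", "Invoice": "101"}, "Assets:Receivables", Decimal("10.00")),
    ({"Customer": "BBB", "Invoice": "102"}, "Assets:Receivables", Decimal("11.00")),
    ({"Customer": "AAA", "Invoice": "103"}, "Assets:Receivables", Decimal("12.00")),
    ({"Customer": "AAA", "Invoice": "101"}, "Assets:Receivables", Decimal("-10.00")),
]
```

Pivoting `RECEIVABLES` on "Customer" gives 12.00 for AAA and 11.00 for BBB; pivoting on "Invoice" leaves only invoices 102 and 103, since invoice 101 nets to zero.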

Reporting to Single Currency

Another oft-requested feature is the ability to boil down all assets to a single target currency, converting to cost, and then to the final currency. The point is to show a single column of value which can be operated on.

Plugins

I’ve had various ideas for writing new plugins, some make sense, some don’t, but in any case, these accumulate here.

  • Would it make sense for every plugin to provide a validation function? We could then move all the validation routines into their plugin files. I very much like this idea: it creates more isolation for routines and fewer dependencies. Open/Close and Balance checks do seem to fit in this category. Those functions should return only a single list of errors, no entries, and the calling function should perform a simple hash check to ensure that the mutable portion of the entries hasn’t been modified by the user-provided validation functions. ‘beancount.ops.documents’ could benefit from this split.

Plugin: sellgains

  • This would probably benefit in being renamed to ‘check_gains’, a better name.

Plugin: flagged_subset

  • Idea: Write two plugins…
    • One that checks that all the postings with a particular flag on them balance to zero.
    • One that forks out all postings with a particular flag to a separate transaction.

    This is from an email thread with redstreet0 in Jan 2015:

    I said: ”

    • Define yourself a special account under a common base, e.g. Equity:Extra:* for all those special accounts.
    • Write a plugin that will ensure that for all transactions that include at least one posting on an Equity:Extra account, the sum of all the weights of these postings is zero.
    • If you want to automatically fill in missing postings to these accounts, you can also do that from a plugin.
    • Your plugin should be configurable with the root account you want to make special in that way, in this case “Equity:Extra”. See other plugins for how to pass in a configuration.
    • You can optionally filter out all those Equity:Extra:* postings in the reports using the FROM syntax. Otherwise the detail of the Equity:Extra accounts in the balance sheet will be pretty harmless anyhow, but you could remove it.

    Note that instead of identifying these special postings using a known root account, you could instead trigger that capability by using posting flags.

    (Also, note that if all you care about is the balanced virtual accounts, that’s entirely equivalent to a second transaction on the same date. I could be convinced to add that in, a special state for a subset of postings, as a “shadow transaction” whereby the parser splits the single transaction into two separate ones, perhaps adding a tag to the shadow one so that you can filter them out at will. That could be implemented as a plugin, BTW, separating postings that have a particular flag on them.)

    ”. This could be used to simulate balanced virtual postings.
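The check described in the thread (all postings under a configurable root account must sum to zero within each transaction) can be sketched as below. Transactions are modeled as `(narration, postings)` pairs with `(account, number, currency)` postings, and plain amounts stand in for weights; these structures and the name `check_subset_balances` are simplifying assumptions, not Beancount's plugin interface.

```python
from collections import defaultdict
from decimal import Decimal


def check_subset_balances(transactions, root="Equity:Extra"):
    """Return error strings for transactions whose `root` postings don't balance.

    Only postings on `root` or one of its subaccounts are considered, and
    they must sum to zero per currency within each transaction.
    """
    errors = []
    for narration, postings in transactions:
        residual = defaultdict(Decimal)
        for account, number, currency in postings:
            if account == root or account.startswith(root + ":"):
                residual[currency] += number
        for currency, total in sorted(residual.items()):
            if total != 0:
                errors.append(
                    f"{narration}: {root} postings leave {total} {currency}")
    return errors
```

Selecting the postings by flag instead of by root account, as suggested in the note above, would only change the condition inside the inner loop.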

Plugin: alt_date

  • Create a plugin that allows you to replace the date with some of the metadata fields, e.g. to create alternative date histories.

    “Note that if you really badly wanted alternative history, you could easily enter alternative dates as metadata (Beancount will recognize and parse a datetime.date type as a value for metadata) and you could write a very simple plugin that converts all the transactions to use the alternative date where present in the metadata (or otherwise leave the date as is). You could even define yourself multiple different sets of alternative dates by using different metadata fields… you can go crazy if you like and create multiple versions of history that way. But that would be segregated to a plugin so I’m comfortable with it, do whatever you like in plugins, they’re perfect for experimentation.”
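The transformation described in the quote is only a few lines. The sketch below models entries as dictionaries with "date" and "meta" keys rather than Beancount's real directive objects; the metadata key `alt_date` and the function name are hypothetical.

```python
import datetime


def apply_alt_dates(entries, key="alt_date"):
    """Return entries re-dated from the metadata field `key`, sorted by date.

    Entries without the field, or whose value is not a datetime.date, keep
    their original date. The input entries are not modified.
    """
    result = []
    for entry in entries:
        alt = entry["meta"].get(key)
        if isinstance(alt, datetime.date):
            # Copy the entry so the original history stays intact.
            entry = {**entry, "date": alt}
        result.append(entry)
    # Re-sort, since replacing dates can change the chronological order.
    return sorted(result, key=lambda e: e["date"])
```

Running the function with different values of `key` would give the multiple versions of history mentioned in the quote.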

Plugin: unverified

  • Idea! Allow the selection or reporting of all the postings since their balance check in each account. These postings can be called “unverified” and it should be possible to report just those. Maybe we can restrict further to the list of those without a ‘*’ flag, or maybe just those with a ‘!’ flag. “Show me all that’s unverified right now.”

    Can this be accomplished with the shell? I suppose a new plugin could be created to flag unverified entries with meta-data and then filter on that. That’s probably the best way to do this.

Plugin: no_zero_amounts

  • Move the check for zero units (“Amount is zero” from the parser) to a plugin, and make this selectively removable.

Plugin: no_unused_pad

  • The validation check that pushes an error on unused pad directives should be moved to a plugin, that should be optional. There is rationale for allowing to keep unused pad directives. Don’t be so strict, Martin.

Plugin: init_pad

  • Idea: a plugin that autopads all initial balance assertions! Do it for demos, will be very useful for making demos easier, not having to be so strict.
  • Write a plugin that automatically inserts a padding directive for accounts with no open directive and with a balance check.

Plugin: multi_pad

  • New plugin type: a kind of spread-out Pad directive that creates multiple pads at regular intervals. This is to deal with smooth cash distributions or work meals assignment. You should be able to specify the frequency and have it automatically insert a number of entries to spread the expense evenly. ‘evenpad’, ‘multipad’, ‘distribution’? This should most definitely be a plugin.

Plugin: cost_pad

  • (pad) Review the possibility of padding units held at cost:

    “The reason it fails is that there must have been units of those commodities held at cost before the pad date, and it is an error to pad commodities at cost, because Beancount has no way to know what the cost basis of those commodities should be.”

Plugin: auto_remove

  • Add an auto-remove-unused part to the auto_accounts plugin that automatically removes Open directives for unused accounts. This is useful for demos and such.

Plugin: monotonic

  • Write a plugin that enables a check that all postings’ amounts are of the correctly allowed sign, e.g. Expenses should almost always be a positive amount, Income a negative one, Assets can be both, also Liabilities. If an amount is posted in the contra direction, this should trigger a warning, unless the transaction is flagged with a particular character, or some special tag is present.

Plugin: match_pending

  • Implement beancount.plugins.tag_pending as a general feature of links… this ought to be built-in by default, this is a great idea.

Plugin: balance_constraint

  • Idea: You could add a further constraint property to an account name: that the amounts may never be allowed to balance to a particular sign. This could be useful to avoid data entry mistakes. You could even write a doc just focused on all features designed to avoid data entry mistakes.
  • Create a verification plugin that verifies that the balance of an account does not go under some negative threshold below/above zero. This way you could check that account balances are of the expected size. (The plugin should accept transient negative balances within the day though, as those are order-dependent.)
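The threshold check with its within-the-day tolerance can be sketched as follows: postings are first netted per day, and only the end-of-day running balance is compared against the minimum. Postings are plain `(date, account, number)` tuples in a single currency, and the name `check_min_balance` is hypothetical.

```python
import datetime
from collections import defaultdict
from decimal import Decimal


def check_min_balance(postings, account, minimum):
    """Return (date, balance) pairs where the end-of-day balance < minimum.

    Postings are netted per day before the comparison, so a transient dip
    that is covered later on the same date does not raise an error.
    """
    daily = defaultdict(Decimal)
    for date, acct, number in postings:
        if acct == account:
            daily[date] += number
    errors = []
    balance = Decimal(0)
    for date in sorted(daily):
        balance += daily[date]
        if balance < minimum:
            errors.append((date, balance))
    return errors
```

The mirror-image check (a balance that may never go above some ceiling) would use the same loop with the comparison reversed.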

Plugin: tip_calculator

  • Write another example plugin that splits “Expenses:Restaurant” legs into two postings: “Expenses:Restaurant x 83%” and “Expenses:Tips x 17%”.
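A minimal sketch of the split follows. The tip is rounded to cents and the restaurant leg takes the remainder, so the two postings always add up to the original amount. Postings are modeled as `(account, number, currency)` tuples and the function names are hypothetical.

```python
from decimal import Decimal


def split_tip(amount, tip_fraction=Decimal("0.17"), cents=Decimal("0.01")):
    """Split `amount` into (base, tip), rounding the tip to `cents`."""
    tip = (amount * tip_fraction).quantize(cents)
    # The base absorbs the rounding so that base + tip == amount exactly.
    return amount - tip, tip


def split_restaurant_postings(postings, account="Expenses:Restaurant",
                              tips="Expenses:Tips"):
    """Replace each posting on `account` with an 83% / 17% pair."""
    result = []
    for acct, number, currency in postings:
        if acct == account:
            base, tip = split_tip(number)
            result.append((acct, base, currency))
            result.append((tips, tip, currency))
        else:
            result.append((acct, number, currency))
    return result
```

Because the remainder is assigned to the restaurant leg, the transaction still balances after the split regardless of rounding.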

Plugin: gifi_reporting

  • You could write a script to automatically fill this form: http://www.cra-arc.gc.ca/E/pub/tg/rc4088/rc4088-12e.pdf

    “With Beancount, one thing that would be doable outside of Beancount, as a separate script, is to associate a set of accounts to these GIFI codes and automatically generate the forms.”

Plugin: sans_cost

Capital Gain Without Cost.

  • Implement the proposal for putting the capital gain in the cost as a plugin that transforms the relevant transactions, those tagged as such. This will require some loosening of the booking method in order to make it easier to disambiguate a sale, and some good debugging tools as well.

    You could automatically look for the right amounts by looking at the signs. I think you could automate a lot of it.

  • Compare with what I’ve done for wash sales for TY2015.

Plugin: avco_checker

Strict Average Cost Inventory Booking Checker.

  • Build a plugin that will check that accounts with average cost inventory booking method only have such reductions in them:

    https://docs.google.com/document/d/1F8IJ_7fMHZ75XFPocMokLxVZczAhrBRBVN9uMhQFCZ4/edit#heading=h.m74dwkjzqojh

    “Another approach would be to not enforce these aggregations, but to provide a plugin that would check that for those accounts that are marked as using the average cost inventory booking method by default, that only such bookings actually take place.”

    This is, of course, to be implemented only after implementing support for the average cost inventory booking method.

Plugin: match_pending

  • Write a plugin that will pair up (link) matching transactions posting to a common account; this could be used to properly pair up the transactions to Liabilities:AccountsPayable, which would be very useful.

Plugin: at_cost_only

  • Idea for one of those constraint plugins: Create a plugin whereby you declare some currencies and assert that any units in them are always held at cost.

Plugin: subset_balances

  • Create a plugin that filters out postings with a particular per-posting flag, and that checks that they are balanced by themselves.

Super-Plugins: Pedantic & Auto

There are two groups of plugins which I use quite often, and which would benefit from being possible to enable all at once: “auto everything” and “super pedantic.” Ideally the list of actual plugins being run would be expanded in the loader. I think we ought to just make the loader support modules in its __plugins__ attribute, and it would insert them this way in options_map, and run them individually, or maybe use another module attribute, e.g. __plugins_delegate__ or whatever.

  • (pedantic) Create a new plugin-of-plugins which enables all the possible constraining other plugins in order to make Beancount as tight as possible by default. Making the hardcore validation into a plugin might be a good way to provide a strict mode. You could build a “beancount.plugins.pedantic” plugin-of-plugins.
  • (auto) Create a new plugin-of-plugins which enables all the auto-declaration plugins in order to make crafting little test files as convenient as possible. I do this all the time, I need this actually. You could build a “beancount.plugins.auto” plugin-of-plugins.
  • Put all verification plugins, including nounused and noduplicates under beancount.plugins.constraints.*.

Payees as Subaccounts

Payees whose transactions always post to the same expense accounts may be viewed as sub-accounts of those accounts, in some sense. I’ve been toying with the idea of taking advantage of this in various ways. For example, one could break down the balance of an account by payee, or conversely, use the leaf account name as the payee itself (and remove it). I’m not sure if this is useful. This would also require some clean payee names. This section contains some ideas about that.

  • Idea: Allow sub-account names to include a special character, e.g., ‘#’, (only one) that would indicate to the reporting facilities that, by default, the aggregation should be reported to the parent account. A “detail” or “verbose” switch could be used to trigger the detailing of subaccounts. For example,

    Expenses:Health:Medical:#Claims Expenses:Health:Medical:#PatientSavings Expenses:Health:Medical:#ClaimsPayments

    would be reported as

    Expenses:Health:Medical

    by default, but with the detailed switch, would be reported as

    Expenses:Health:Medical:Claims Expenses:Health:Medical:PatientSavings Expenses:Health:Medical:ClaimsPayments

    This could be used for various subaccounts actually. It’s a nice way to guide reporting that does not complexify the semantics.

  • Idea: in the query language, provide a special Account:Payee field, in order to play with the notion of payee-as-subaccount often discussed.
  • About the discrepancy between the concept of “Payee” and a superfluous leaf account, e.g. Internet:TimeWarner, which typically contains only transactions from that payee: maybe we can elide the account name if it contains only a single payee, or perhaps a warning may be issued? I don’t know.

    Basically, it would be nice to be able to have multiple payees in the same category over time (e.g. Electricity, Internet) but to be able to separate them somehow, without having to put the payee into the name. This is a little fuzzy, and I’m not sure how to do it, because the imported payee names are often not very clear and often truncated as well.

    Have you ever thought that Payees often end up functioning like an extra subaccount? I’ve come to realize that for Payees that only ever touch a single account, the line is really fuzzy there. I’ve been entertaining the idea of automatically creating subaccounts for payees like that.

  • Write a script that will highlight some “payee vs tags vs subaccount” invariants:
    • Highlight payees that are always used with the same accounts
    • Same with tags
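The first invariant above (payees that are always used with the same account) can be sketched as below. Transactions are reduced to `(payee, accounts)` pairs, and only accounts under `Expenses:` are considered, which is an assumption made here to ignore the funding side of each transaction; the function name is hypothetical.

```python
from collections import defaultdict


def single_account_payees(transactions, prefix="Expenses:"):
    """Map each payee to its account if it only ever uses a single one.

    `transactions` is an iterable of (payee, accounts) pairs, `accounts`
    being the accounts posted to. Payees seen with more than one account
    under `prefix`, or with none at all, are left out of the result.
    """
    accounts_by_payee = defaultdict(set)
    for payee, accounts in transactions:
        if payee:
            accounts_by_payee[payee].update(
                account for account in accounts if account.startswith(prefix))
    return {
        payee: next(iter(accounts))
        for payee, accounts in accounts_by_payee.items()
        if len(accounts) == 1
    }
```

The same function works for tags by passing `(tag, accounts)` pairs, once per tag on each transaction.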

Old Notes

  • Create a plugin that will define subaccounts for payees within accounts and modify all the transactions accordingly. This would be a great way to kick the tires on this idea without affecting the rest of the system.

    Expenses:Electricity:ConEdison Expenses:Phone:TMobile Expenses:Groceries:WholeFoodsMarket Expenses:Groceries:UnionMarket

  • Should we define some notion of the default level for aggregation, per account? For example, in Expenses:Electricity:ConEdison, the default level of aggregation should be Expenses:Electricity. If we define that, using subaccounts should not bother us much.

Budgets & Temporal Constraints

The concept of “budgeting” should be implemented as a list of constraints on top of what Beancount already offers. It could be segregated to a plugin, as long as the input syntax is powerful enough to support it. Some people in the ‘fava’ team have already begun experimenting and that’s why I added support for a Custom directive. However, eventually this should be formalized a bit more: unknown directives should be allowed to trickle through, and plugins should either register a validation function for the expected data types that were seen (this wouldn’t be grammar-level parsing; it would have to happen after the grammar has run) or validate their matching directives directly.

  • It would be useful to create a directive that checks that the balance of an account in a time interval is lesser or greater than some specific amount, e.g.

    Expenses:US:TY2014:SocSec <= 7254 USD ;; 6.2% of 117,000 USD
    Expenses:US:TY2015:SocSec <= 7347 USD ;; 6.2% of 118,500 USD

  • “Envelope budgeting” is easily implemented like this:
    • A sub-account of a real assets account is marked as special.
    • Those sub-accounts are used to transfer funds from their parent account. They are constrained to only ever transfer funds to and from that parent account.
    • Their transactions can simply be filtered out in order to remove the budgeting aspect.
    • You should be able to report balances on these accounts. In fact, since they’re marked as special, you should be able to gather all of them and report the balances of all the envelopes. That’s the summarization for the budgeting function.
    • Finally, some level of automation should be provided to automatically replenish the envelopes at some set frequency (e.g. the beginning of each month), including rolling over unspent amounts.

Plugin: budget

Budgeting / Goals

  • We could easily add features for budgeting, e.g. somehow set goals and then report on the difference between actual and the goal, and project in the future according to rate, e.g.:

    ;; Check that the total for this position between 2013-06-01 and 2013-12-31 < 800 USD
    2013-12-31 goal Expenses:US:Restaurant 2013-06-01 < 800 USD

    ;; Check that the balance of this account on 2013-12-31 is >= 1000 USD
    2013-12-31 goal Assets:Savings >= 1000 USD
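The first kind of goal above (a limit on the total posted to an account over a date interval) reduces to a filtered sum. The sketch below works on `(date, account, number)` tuples in a single currency; the function name `check_interval_goal` is hypothetical and the `goal` directive itself remains only a proposal.

```python
import datetime
from decimal import Decimal


def check_interval_goal(postings, account, start, end, limit):
    """Return (ok, total) for the goal "total posted in [start, end] < limit".

    `total` is the sum of the postings on `account` dated within the closed
    interval; it is returned so a report can show actual versus goal.
    """
    total = sum(
        (number for date, acct, number in postings
         if acct == account and start <= date <= end),
        Decimal(0),
    )
    return total < limit, total
```

The second kind of goal (a balance on a given date) is the same computation with `start` set to the earliest possible date and the comparison replaced by `>=`.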

Filtering

Views

  • Replace all views by filtering queries… the root page should still have convenient links to various preset views, like the last five years, but these links should be implemented using the filtering query feature! Maybe it’s worth allowing the user to specify common queries in the options map, and provide links to them. Do this, and try removing some of my subaccounts to simplify the accounts-trees somewhat.
  • The root page should feature a prominent input form that allows the user to specify a query! This input needs to live at the very root.
  • (views) You should be able to filter to all transactions related to some account, e.g. Immigration
  • IMPORTANT! Try to let through some of the non-transaction entries in the view filtering. We obviously cannot let through balance entries, but documents yes, depending on the type of filtering. We should do our best to let all the entries carry through.

Custom dimensions

  • From discussion:
    (digression not about virtual postings but answers auxiliary questions about
    them)
    Now this points to a more general idea that I’ve been pondering for a while:
    these “accounts” can often be seen as a set of flat dimensions, the fact that
    they have a hierarchy can get in the way. I tend to have accounts that look
    like this:
    TYPE:COUNTRY:INSTITUTION:ACCOUNT:SUBACCOUNT
    like this, for example:
    Assets:US:HSBC:Checking
    Assets:CA:RBC:Savings
    For these four dimensions, I actually like having most accounts (Assets,
    Liabilities and Income) specify them in this order. This does not always make
    sense though, especially for expense accounts; for those you wouldn’t really
    want to have a COUNTRY dimension at the root. You want the general category
    only, so I’ll have, for example:
    Expenses:Food:Restaurant
    Expenses:Food:Grocery
    but sometimes the dimensions get inverted too, like in my recent change about
    how to track taxation:
    Expenses:Taxes:US:TY2014:Employer:Federal
    Expenses:Taxes:US:TY2014:Employer:StateNY
    Expenses:Taxes:US:TY2014:Employer:CityNYC
    Here the “institution” is your employer, and shows deeper in the hierarchy.
    Finally, you often do want to have multiple types for the same or similar
    accounts, for instance, to track gains and dividends income from a particular
    investment account, you want a mirror of most of the dimensions except for the
    assets bit:
    Assets:US:ETrade:IRA -> Income:US:ETrade:IRA
    For instance:
    Assets:US:ETrade:IRA:Cash
    Income:US:ETrade:IRA:Dividends
    You see what I’m getting at… these components really operate more like a
    database table with values possibly NULL, e.g.,
    type      country  institution  account   category
    --------  -------  -----------  --------  ----------
    Assets    US       HSBC         Checking  NULL
    Assets    CA       RBC          Savings   NULL
    Assets    US       ETrade       IRA       Cash
    Income    US       ETrade       IRA       Dividends
    Expenses  NULL     NULL         Food      Restaurant
    Expenses  NULL     NULL         Food      Grocery
    Having to order your account components in a hierarchy forces you to
    decide how you want to report on them, a strict order of grouping from
    top to bottom.
    So I’ve been thinking about an experiment to rename all accounts according to
    dimensions, where the ordering of the components would not matter. These two
    would point to the same bucket, for example (changing the syntax slightly),
    ExpensesTaxesUSTY2014EmployerFederal
    ExpensesUSEmployerTaxesTY2014StateNY
    You could then display reports (again, the usual reports, balance sheet,
    income statement, journals) for “the subset of all transactions which has one
    posting in an account in <set>” where <set> is defined by values on a list of
    dimensions, a bit like a WHERE clause would do.
    Now, now, now… this would be a bit radical, now wouldn’t it? Many of these
    accounts do point to real accounts whose postings have to be booked exactly,
    and I’m a bit worried about the looseness that this could introduce. One and
    only one account name for a particular account is a nice property to have.
    So what can we do to select across many dimensions while still keeping
    hierarchical account names?
    The first thing I did in Beancount is to create views for all unique account
    component names. For example, if the following account exists:
    Assets:US:ETrade:IRA
    You will see four “view” links at the root of the Beancount web page:
    Assets
    US
    ETrade
    IRA
    Clicking on the link selects all the transactions with a posting with an
    account where that component appears. (Views provide access to all the reports
    filtered by a subset of transactions.) You can click your way to any journal
    or report for that subset of transactions. This exists in HEAD today. You can
    draw all the reports where a particular component appears, e.g., “Employer”, as
    in “Income:US:Employer:Salary” and “Expenses:Taxes:US:TY2014:Employer:Federal”.
    But this does not define “dimensions.” It would be nice to group values for
    these components by what kind of thing they are, e.g., a bank, an institution, a
    country, a tax year, etc, without regard for their location in the account
    name. A further experiment will consist in the following: again assuming
    unique “account component names” (which is not much of a constraint to
    require, BTW, at least not in my account names), allow the user to define
    dimensions by declaring a list of component names that form this dimension.
    Here’s how this would look, with the previous examples (new syntax):
    dimension employer Microsoft,Autodesk,Apple
    dimension bank HSBC,RBC,ETrade
    dimension country US,CA,AU
    dimension taxyear TY2014,TY2013,TY2012,TY2011,TY2010
    dimension type Assets,Liabilities,Equity,Income,Expenses (implicit?)
    You could then say something like “show me trial balance for all transactions
    with posting accounts where bank is not NULL group by bank” and you would
    obtain mini-hierarchies for each group of accounts (by bank, across all other
    dimensions).
    (With the state of my current system, I could probably code this as a
    prototype in a single day.)
    Additionally, accounts have currency constraints and a history of postings
    which define a set of currencies used in them. More selection can be done with
    this (e.g., show me all transactions with postings that credit/debit CAD
    units).
    IMHO, all you’re trying to do with these virtual accounts is aggregate with
    one less dimension, you want to remove the real account and group by community
    project. My claim is that there are ways to do that without giving up on the
    elegant balancing rules of the DE system.

    In relation to this… these “dimensions”, could they just become other dimensions in the filtering language?

    component:Microsoft

    employer:Microsoft bank:RBC country:US

    You can then break down by those, like a GROUP BY clause, and generate reports that have those as root accounts, or separate breakdowns.
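The dimension idea above can be sketched in a few lines, assuming unique component names as discussed. The `DIMENSIONS` table and the helpers `account_dimensions` and `group_by_dimension` are hypothetical names for illustration, not part of Beancount.

```python
from collections import defaultdict

# Hypothetical dimension declarations, mirroring the "dimension" syntax above.
DIMENSIONS = {
    "employer": {"Microsoft", "Autodesk", "Apple"},
    "bank": {"HSBC", "RBC", "ETrade"},
    "country": {"US", "CA", "AU"},
}


def account_dimensions(account):
    """Map an account name to {dimension: value}; absent dimensions (NULL) are omitted."""
    values = {}
    for component in account.split(":"):
        for dimension, members in DIMENSIONS.items():
            if component in members:
                values[dimension] = component
    return values


def group_by_dimension(accounts, dimension):
    """Group accounts by their value on one dimension, skipping NULLs."""
    groups = defaultdict(list)
    for account in accounts:
        value = account_dimensions(account).get(dimension)
        if value is not None:
            groups[value].append(account)
    return dict(groups)


accounts = [
    "Assets:US:HSBC:Checking",
    "Assets:CA:RBC:Savings",
    "Income:US:ETrade:IRA:Dividends",
    "Expenses:Food:Restaurant",
]
by_bank = group_by_dimension(accounts, "bank")
```

Here `by_bank` plays the role of "where bank is not NULL group by bank": the Expenses account has no bank component and drops out.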

Tags used as dimensions

  • If you had tags as key-value pairs, those could be used as well:

    2014-05-21 * … #employer:Microsoft

    Searching for:

    tag:employer=Microsoft

    This is another dimension in the same filtering language.
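A rough sketch of how such filter terms might be parsed and matched, assuming whitespace-separated `key:value` terms and a `tag:key=value` form for key-value tags. The function names and the dict-based inputs are assumptions made for illustration only.

```python
def parse_filter(expression):
    """Parse whitespace-separated 'key:value' terms into a list of (key, value)."""
    terms = []
    for token in expression.split():
        key, sep, value = token.partition(":")
        if not sep or not key or not value:
            raise ValueError("Invalid filter term: {!r}".format(token))
        terms.append((key, value))
    return terms


def matches(terms, dimensions, tags):
    """Return True if every term matches.

    'dimensions' maps a dimension name to its value for a posting's account;
    'tags' maps key to value for key-value tags on the transaction.
    A 'tag:key=value' term is checked against 'tags', all others against
    'dimensions'.
    """
    for key, value in terms:
        if key == "tag":
            tag_key, _, tag_value = value.partition("=")
            if tags.get(tag_key) != tag_value:
                return False
        elif dimensions.get(key) != value:
            return False
    return True


terms = parse_filter("bank:RBC tag:employer=Microsoft")
```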

Operations

Validation

  • Write a dedicated routine to check the following invariant:

    check_is_at_beginning_of_day = parser.SORT_ORDER[Check] < 0
    …
    if check_is_at_beginning_of_day:
        if isinstance(entry, (Check, Open)):
            assert entry.date > prev_date, (
                "Invalid entry order: Check or Open directive before transaction.",
                (entry, prev_entry))
        else:
            prev_entry = entry
            prev_date = entry.date

  • Sanity check: Check that all postings of Transaction entries point to their actual parent.
  • (validation) In addition to the Check/Open before-constraint, check that the entries are always sorted. Add this sanity check.
  • The default validation should check the invariant that Open and Check directives come before any Transaction.
  • Validation: Everywhere we have a filter of entries to entries, we should be able to apply a check that the total balances before and the total balances after should have the very same value.
  • In validate.py: differentiate between the case of an entry appearing too early before an Open directive, and an entry appearing for an account that simply just doesn’t exist.
  • Auto-detect and warn on likely duplicate transactions within the file.
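One possible shape for the ordering checks listed above, as a sketch only. The namedtuples are simplified stand-ins for the real directive types, and `validate_order` is a hypothetical name; errors are collected and returned rather than raised.

```python
import collections
import datetime

# Simplified stand-ins; the real directives carry more fields.
Open = collections.namedtuple("Open", "date account")
Check = collections.namedtuple("Check", "date account")
Transaction = collections.namedtuple("Transaction", "date narration")


def validate_order(entries):
    """Return a list of error strings for entries that violate the ordering invariants."""
    errors = []
    prev = None
    for entry in entries:
        if prev is not None:
            if entry.date < prev.date:
                errors.append(
                    "Entries not sorted by date: {} after {}".format(entry, prev))
            elif (entry.date == prev.date
                  and isinstance(entry, (Open, Check))
                  and isinstance(prev, Transaction)):
                errors.append(
                    "Open or Check after a Transaction on the same day: {}".format(entry))
        prev = entry
    return errors


day = datetime.date(2014, 1, 1)
errors = validate_order([Transaction(day, "Lunch"), Check(day, "Assets:Cash")])
```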

Padding

  • Idea: Padding entries could be extended a tiny bit in order to automatically calculate the cash distribution entries, e.g., like this:

    2014-03-04 pad Asset:Cash Expenses:Restaurant 60%
    2014-03-04 pad Asset:Cash Expenses:Alcohol 40%

    2014-04-04 pad Asset:Cash Expenses:Restaurant 70%
    2014-04-04 pad Asset:Cash Expenses:Alcohol 30%

    2014-05-04 pad Asset:Cash Expenses:Restaurant 70%
    2014-05-04 pad Asset:Cash Expenses:Alcohol 30%

    This is a great idea, is in line with the general meaning of pad entries (implicit 100%) and would add a much desired feature.

  • Add tests for all the cases of realization padding.
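The percentage split proposed for pad entries could be computed roughly as follows. This is a sketch under assumptions: `distribute_pad` is a hypothetical helper, the percentages must sum to 100, and the last account absorbs the rounding residual so the parts always add up to the padded amount.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def distribute_pad(amount, shares, quantum=Decimal("0.01")):
    """Split 'amount' across accounts according to percentage 'shares'.

    'shares' is a list of (account, percent) pairs summing to 100.
    """
    if sum(percent for _, percent in shares) != 100:
        raise ValueError("Pad percentages must sum to 100")
    parts = []
    remaining = amount
    for account, percent in shares[:-1]:
        part = (amount * percent / 100).quantize(quantum, rounding=ROUND_HALF_EVEN)
        parts.append((account, part))
        remaining -= part
    # The last account takes whatever is left, including any rounding residual.
    parts.append((shares[-1][0], remaining))
    return parts


parts = distribute_pad(
    Decimal("100.01"),
    [("Expenses:Restaurant", 60), ("Expenses:Alcohol", 40)])
```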

Locations

  • @location really should just convert into a generic event “location”, just as address and school should; they’re just events with forward fill… Serve this at:

    http://localhost:8080/20120101/20130101/events/location/days

  • Add a “reason” field for @location, and display as trips, with some sort of meaning to them. Ok, this contradicts the previous idea.

Parser

Errors

  • Parsing an indented string results in some output… for example, this code spits out a single Balance entry (the last one):

    balance_entries, _, _ = parser.parse_string("""
      2014-01-02 balance Liabilities:CreditCard 100.00 USD
      2014-01-03 balance Liabilities:CreditCard 200.00 USD
    """)

    It should not. It should output nothing, because these entries are indented. I don’t understand why it does. This is a parser bug which needs to be traced and debugged. This has come up in writing tests here and there, forgetting to insert dedent=True.

  • We need to gather errors in a single place and report them like the others; right now I’m catching them in sum_to_date() and writing using the logging module; but they really should be trickled up with the rest.
  • Syntax errors currently have no location… this is unacceptable. Write an automated test, check for all kinds of errors, in the lexer, in the parser, in the Python. (Just work with the line number, we don’t really need character position.) Test everything with automated tests.
  • ‘lineno’ is incorrect, it points to the next entry, not the previous one, fix this bug! This is really annoying.
  • Set a correct filename in grammar.y
  • Errors from the parser and others should all be accumulated in one place, so that we do all the reporting at the very top level.
  • Don’t raise error exceptions anywhere; log everything to an error handler/accumulator class instead, and skip to the next entry/declaration.
  • Propagate exception from Python(?)
  • Problematic transactions (!) should spit something of color on stdout, they should not be forgotten so easily.
  • Bug: Invalid account names should only be reported once.

Lexer Work

Errors in Flex Lexer

  • (parser) Support enabling flex debugging in beancount.core._parser.parse(), using “yyset_debug(int bdebug)”.
  • When an error occurs, skip the lexer to the next empty line and recommence.
    • Modify the lexer to emit EOF and add that in the grammar rules for empty_line.

Write a New Lexer From Scratch

Reasons to write your own lexer manually:

  • Should support UTF-8 encoding.
  • Should support SCHEDULE entries for org-mode (see email discussion).
  • More flexible syntax (see “Is it possbile for beancount to ignore org-mode SCHEDULED and DEADLINE?” thread).
  • Better error reporting
  • (performance) Write your own lexer manually and compare performance with flex one. I think we can do a much better job at error reporting by writing our own, but I’m unsure how the performance compares.
  • IMPORTANT LATENT ISSUE. You need to extend the lexer to parse A-Z for flags, not just PSTCU! This is important, as I just realized that it could prevent the correct parsing for entries in a round-trip, with postings produced with unexpected flags. In fact, any character with whitespace on each side should parse as a flag. This is very important.

    This manifests when adding a posting with letter ‘M’ right now. Replicate this, fix the problem.
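To make the flag rule concrete, here is an illustrative regular expression, not the flex rule itself. It assumes a posting line is indentation, an optional one-character flag followed by whitespace, then an account name; the accepted flag characters (`A-Z`, `!`, `*`) and the account pattern are simplifying assumptions.

```python
import re

# Assumed posting shape: indentation, optional flag + whitespace, account name.
POSTING_RE = re.compile(
    r"^\s+(?:(?P<flag>[A-Z!*])\s+)?"
    r"(?P<account>[A-Z][A-Za-z0-9\-]*(?::[A-Z0-9][A-Za-z0-9\-]*)+)")


def parse_posting_flag(line):
    """Return (flag, account) for a posting line, or None if it does not match."""
    match = POSTING_RE.match(line)
    if match is None:
        return None
    return match.group("flag"), match.group("account")


flagged = parse_posting_flag("  M Assets:Cash  10 USD")
unflagged = parse_posting_flag("  Assets:Cash  10 USD")
```

A lone capital letter cannot be mistaken for an account here because the account pattern requires at least one colon-separated component.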

Make the Parser Reentrant

Syntax

  • You should support a payee with no description! This generates a parser error right now.
  • The syntax should support multiline strings naturally…
  • For Links vs. Tags: don’t impose an order, parse as tags_or_links. Right now the order is tags_list and links_list.
  • You should accept commas in the input; simply ignore their value.
  • Add 1/rate syntax for prices (and anything… really, why not). Convert at parsing time.

Sensible Syntax for Lots

  • Consider making the lot syntax like this:

    -4 {HOOL @ 790.83 USD}

    instead of:

    -4 HOOL {790.83 USD}

    It’s actually a lot more accurate to what’s going on…

Testing

  • Allow file objects to parse() and dump_lexer(). This should use fdopen() or whatever else to get this job done at the parser level.
  • You need to clean up the memory of the strings created; call free() on each string in the rules.
  • Add a unit test for pushtag/poptag errors.
  • Add unittests for tags, pushtag/poptag

Reports

Text Statements

  • Complete text-statements branch, printing out balance sheet and income statement to text for bean-report and ELI, complete with --date on those reports.
  • Using an intermediate data structure, produce text and csv / xls reports, downloadable from the web front-end, or even generatable directly. All of this reporting infrastructure should be reusable, ideally.
    • A text rendering of the balance sheet / income statement would be very useful for collaboration/communication with others. Add a link to download a text version of any report. This would be made easy if we only have a few distinct types of reports.

List of Unreconciled Transactions and Postings

  • Perhaps we want to produce a report of all transactions with a highlight on them.

Balance Sheet

  • (web) We really need to reorder the accounts in a way that is more sensible… it’s annoying to see the accounts I care about at the top of the page. Cash, Points, AU, should be at the bottom… I wonder if there’s a nice heuristic. Last updated date? I think that would be good enough.
  • We need to figure out how to order the accounts on the balsheet; I want the most relevant near the top. Sorting accounts: compute a score made up of
    • nb. txns from the last 3 months
    • nb. checks from the last 3 months (weighted more heavily)
    • line-no of Open directive in the file.
    • last updated date.
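A sketch of how the score components above might be combined. The weights and the function name `account_score` are arbitrary placeholders chosen for illustration; only the four inputs come from the list above.

```python
import datetime


def account_score(txn_dates, check_dates, open_lineno, today,
                  window_days=90, check_weight=3.0):
    """Higher scores sort nearer the top of the balance sheet."""
    cutoff = today - datetime.timedelta(days=window_days)
    recent_txns = sum(1 for date in txn_dates if date >= cutoff)
    recent_checks = sum(1 for date in check_dates if date >= cutoff)
    all_dates = list(txn_dates) + list(check_dates)
    if all_dates:
        days_since_update = (today - max(all_dates)).days
    else:
        days_since_update = window_days * 10  # never updated: strong penalty
    # The line number of the Open directive only breaks near-ties.
    return (recent_txns
            + check_weight * recent_checks
            - days_since_update / window_days
            - open_lineno / 1e6)


today = datetime.date(2014, 6, 1)
active = account_score(
    [datetime.date(2014, 5, 20), datetime.date(2014, 5, 25)],
    [datetime.date(2014, 5, 30)], open_lineno=10, today=today)
stale = account_score([datetime.date(2013, 1, 1)], [], open_lineno=5, today=today)
```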

HTML Rendering

  • Rendering: When you collapse a parent account, its aggregate amount should render, and disappear when not collapsed.
  • Numbers should align to the right.
  • USD and CAD should be aggregated in their own columns for balance sheet and income statements. These should be specified from the command-line.
  • All entries should have collapsing a-la-JavaScript, along with collapse/reveal all buttons. All JS.
  • If the software is finally fast enough in Go, render RESTful on the fly for any date:
    • REST: balsheet/20121231
    • REST: income/20121231/20131231

    This way, you could have any year on the same run. No need to restart, even have a menu with all the years, and perhaps even some of the latest months.

  • It would be really nice to render the line numbers to the source in the HTML
  • (Performance) Implement buffered output for faster response using a separate thread and an iterator that yields from app.write when the data buffer is large enough.
  • Postings that have a “!” flag should have at least their background red.
  • You should more explicitly state the beginning and ending period on each statement pages (it is super important information). Just below the title.

Excel Output

  • Find good ways to transfer data to an Excel spreadsheet. A link to download a file should be supported.

Links to Source

  • The new format code should keep and optionally render the source file/line of any transaction, and allow clicking to get to the source line in the rendering.
  • Maybe there should be a script that can take a report specification and output a list of emacs-compatible links to the entries, interspersed with the text format rendering! You could go “next-error” to go through the entries in time order, emacs taking you there one-by-one.

Event Reports

  • We should be able to count the days of each event type.

Distribution of Expenses and Income

  • Add a pie chart to visualize the constitution of the Income Statement for Expenses and Income.

Summary Reports

  • To create custom views, for example, weekly summaries, you could convert the ledger into another ledger, where entries would have been replaced by summary entries instead, and all the other functionalities would still work.

Financial Ratio Analysis

Trades

  • You should be able to report on all booked trades during a period, especially with the new booking algorithm, this will be useful. Create a new report type for this.

Reports: Overview Stats

  • Add a simple overview ‘stats’ report with output like this:

    Files these postings came from: …

    Unique payees: 2681
    Unique accounts: 151

    Number of postings: 9026 (4.8 per day)
    Uncleared postings: 126

    Days since last post: -206
    Posts in last 7 days: 30
    Posts in last 30 days: 52
    Posts seen this month: 8

  • This ought to replace the current stats reports, which are simply too small and specialized.

Reports: Rendering Journals

  • Journals should render in either order.
  • Add an option for spacing in the revamped reports.
  • When multiple transactions occur in the same day, it makes sense not to render the balance amount until the last one. Do this. This is important.
  • In rendering balance directives, don’t render the amount in the “change” column; that is too confusing for some users, keep the change column for changes.
  • (easy) Don’t render postings in the HTML interface by default. The detail can be made available via bean-query now, and users can click on /context link in order to get the full transaction detail. Journals should be summaries. Add an optional argument to turn it on/off, but it should be off by default.

Reports: Rendering Tables of Balances

  • tree rendering: If a parent account has only a single subaccount and the parent account otherwise has no postings in its realization, render the account on a single line instead of two, e.g.

    Expenses              Expenses
      Taxes                 Taxes
        US                    US
          TY2014                TY2014
            State                 State:Company
              Company               …
                …
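The single-child collapsing rule could look like the sketch below. It assumes a simplified tree in which each node is a dict with a `postings` flag and a `children` mapping; the real realization tree is richer, and `collapse` is a hypothetical name.

```python
def collapse(name, node):
    """Merge a node with its only child when the node has no postings of its own.

    'node' is a dict with keys 'postings' (bool) and 'children' (name -> node).
    Returns the (possibly joined) name and a new node.
    """
    while not node["postings"] and len(node["children"]) == 1:
        (child_name, child), = node["children"].items()
        name = name + ":" + child_name
        node = child
    children = dict(collapse(child_name, child)
                    for child_name, child in node["children"].items())
    return name, {"postings": node["postings"], "children": children}


tree = {"postings": False, "children": {
    "State": {"postings": False, "children": {
        "Company": {"postings": True, "children": {}}}},
    "Federal": {"postings": True, "children": {}}}}
name, collapsed = collapse("TY2014", tree)
```

Note that this version collapses whole chains, so several consecutive single-child parents end up on one line.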

Reports: Rendering Tables

  • For table rendering, move the actual formatting at rendering time. CSV files should have fractional values for percentages, txt and html should have % values. I need to figure out a good solution for this. Maybe the thing to do is to move the field selection at rendering stage, or at least to have it at both.

Reports: Accounts

  • bean-report accounts should produce a regular table that can be rendered with CSV or TXT, not a custom output string.

Reports: Cash

  • Create a function to identify whether a Position/Inventory is cash… use this to reproduce/replace the ‘cash’ report. Use the same rule from bean-report.

Report: XML

  • Output to a structured XML format, some people are finding that useful to build other visualizations. In order to test this completely, do a round-trip test. The code should live in beancount.parser, parallel to the existing code there.

Report: Events

  • Build a ‘events’ report that will print out the current value of all events.
  • Create a new report type: “days” that counts the days of any event in the filtered log.

Price Freshness / Dated Prices

  • Issue warnings if the price date is too far from the requested market value date. This will help with returns, a lot. You should likely do this in the price_map object, in the lookup function, maybe, so that all modules benefit from the feature. You could ideally provide a date and a tolerance, and somehow issue warnings automatically. The tolerance for price freshness should be provided as an option in the input file.
  • Add the capability to issue warnings when the price database is queried for a specific date but the price point is too distant from the requested date.

    “One thing I want to do soon is to issue warnings when the price database is looked up and the price point is too far from the requested date, so that the user could go fill in the missing prices. I’d probably issue price entries with the approximated price (approximated with a distant date) and then feed that into another script that would fetch prices for those directives.”
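A sketch of a lookup with a freshness tolerance, assuming prices are kept as a date-sorted list of `(date, price)` pairs and that the lookup returns the latest price on or before the requested date. `lookup_price` and `tolerance_days` are hypothetical names; the actual price_map interface may differ.

```python
import bisect
import datetime
import warnings


def lookup_price(dated_prices, date, tolerance_days=7):
    """Return (price_date, price) for the latest price on or before 'date'.

    A warning is issued when the price found is older than 'tolerance_days'.
    Returns (None, None) if no price exists on or before 'date'.
    """
    dates = [price_date for price_date, _ in dated_prices]
    index = bisect.bisect_right(dates, date)
    if index == 0:
        return None, None
    price_date, price = dated_prices[index - 1]
    age = (date - price_date).days
    if age > tolerance_days:
        warnings.warn("Price dated {} is {} days before requested date {}".format(
            price_date, age, date))
    return price_date, price


prices = [(datetime.date(2014, 1, 1), 100.0), (datetime.date(2014, 3, 1), 110.0)]
fresh = lookup_price(prices, datetime.date(2014, 3, 3))
```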

Review Design of Tags & Links

  • Consider removing the “links” attribute altogether and making that simply just a tag. Do we really, truly need to distinguish between tags and links?
  • Try removing the ‘tags’ attribute of transactions by moving it to metadata fields and making sure tools are available to perform the same aggregations using bean-query and views using bean-web.
  • Make the Pad directive accept #tags and propagate those to the generated transactions: https://bitbucket.org/blais/beancount/issues/70/add-tag-to-the-pad-directive-and-propagate

Web Interface Improvements

  • (easy) Make b.w.web also use ‘app.options_map’ instead of ‘app.options’.
  • Make the web interface not render postings anymore by default.
  • web: Don’t render the transaction detail anymore; instead, the full context should come up in a tooltip that is computed on-demand. Rendering the basic table should not have to render all the detail upfront, that is always overkill.
  • web: Don’t render the full Inventories; instead, already render at cost and provide their full detail either by clicking on the transaction, which should render the full detail of an inventory (for debugging), or in a tooltip.
  • Highlight (e.g., in the color red) the postings to accounts that are in contra value, e.g. a positive Income, or a negative Expenses posting. (Maybe this is just something that lives in Fava actually)
  • In the web interface, it would be nice to have a fancy client-side JavaScript overlay here, that automatically appears after parsing if there are errors and that automatically smoothly fades out. All errors should be displayed in an overlay; proper error handling and display for the web interface is not optional, it’s important.
  • Figure out how to disable googleapis fonts when on a very slow connection. I’d like to enable the fonts, but if they cannot be fetched quickly, or cached, this should be disabled.
  • Implement a little plug-ins system that allows a user to insert a new tab in bean-web, to insert some custom display.
  • Instead of rendering inventories with the full contents in the journal, render the cost, and place the full inventory in a tooltip!
  • A table of price entries should be rendered under the price graph in the web interface.
  • The Trial Balance page could be a good place to put all the accounts on the left and have two sets of columns: beginning of period -> end of period.
  • In order to implement .txt output, you will need to decouple the web rendering and the generation of its included data. This will be great–ability to cut-and-paste any page into txt. format=txt, and we could still have the links clickable. Everything else would just be txt. A bit of a crazy idea, but it might work well and be simple. Maybe.
  • Render tags on the HTML page
  • Replace gviz charts by some other library that does not require you to be online in order for it to work.
  • Silence BrokenPipeError errors from bottle using wsgiref. You could use CherryPy, which doesn’t suffer from that, or just… fix it and silence them.
  • In balance/aggregate reports, render the balances for parent accounts too, in lighter gray.
  • Render the Conversions amount at the bottom of the Conversions page…

Visualizations

  • Move the TreeMaps experimental script for Expenses and Assets subtrees into the bean-web codebase: https://bost.ocks.org/mike/treemap/ (Or maybe that would just be left for fava.)

Rendering Documents

  • Serving CSV files from the Documents page should not be done via download; rather, they should be rendered directly. Same with text files. As much as possible, files should be served directly.
  • The documents web page should render by-month + date, and by-account + date.

Update Activity Page

  • Update activity: Remove parent accounts with no child accounts.
  • Update activity: This exhibits a bug in the table rendering, look for IVV, see TODO(blais) in acctree.py

Bake Improvements

  • Move beancount.scripts.bake to beancount/web and adjust all the references accordingly.
  • bake: Make bake support curl if wget is not available. It should work with either, to relax dependencies.
  • Make the web scraper run in multiple threads… it’s quite a bit too slow as it is. I’m sure we can make it scrape in parallel using multiprocessing and a work queue (this should be a fun little project and would make baking to an archive a lot faster in many cases).

Programmable View

  • GREAT IDEA! Have a web form that you can input a view filtering expression, e.g. year:2014 component:Microsoft to have that year’s transactions made with this component. Encode the results in a unique string that you can decode and create a corresponding view of the subset selected by the expression. You can then view any of the reports for that subset! This means we can then get rid of many of the root page’s links automatically, yet still provide all the opportunities… this is the way to go, and would best mirror the command-line capabilities.
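One simple way to encode a filtering expression into a URL-safe string and decode it again is URL-safe base64, sketched below. This is only one possible encoding and the function names are hypothetical.

```python
import base64


def encode_view(expression):
    """Encode a filter expression into a URL-safe token (padding stripped)."""
    token = base64.urlsafe_b64encode(expression.encode("utf-8"))
    return token.decode("ascii").rstrip("=")


def decode_view(token):
    """Inverse of encode_view(); restores the stripped '=' padding first."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


token = encode_view("year:2014 component:Microsoft")
```

The token could then be used as a path component of the view URL, and decoded on each request to rebuild the filtered subset.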

Error reporting

  • We really need to list all the entries marked ‘!’ somewhere; they should be more annoying to the user.
  • In the balance sheet or trial balance, mark accounts that have errors in red, or add a little red star next to them.
  • Implement basic error reporting page from the list of errors.

Links

  • (web) Render links to the right of descriptions, and the link href link should actually render a page of the related linked entries.

Aesthetics

  • In the entries table HTML, highlight the relevant posting leg somehow, maybe use a ‘>>’ marker, or make it bold. Something. (Bold is too much, use >>.)
  • Render “as of YYYY-MM-DD” under the title for Balance Sheet, and “from YYYY-MM-DD to YYYY-MM-DD” under the title for Income Statement
  • Answer to favicon.ico request.
  • Add an option to render the entries table upwards, not just downwards.
  • Use that beautiful new font (Roboto) from Tactile in the new rendering. Totally worth it. Use the nice Lucida-like font for numbers, like in TurboTax.

JavaScript / Client-Side Interaction

  • Render balance sheet/ income statement cells with two numbers for parent nodes, so that when you collapse a node, all the amounts of its children sum up automatically and display in its cell. You should have a consistent report regardless of whether nodes are collapsed or not. This will require some JavaScript effort.
  • Implement a JavaScript cursor. J, K = up, down. SPC = toggle.
  • In Journal view, pressing ‘C’ should toggle displaying the checks on and off.

Trial Balance

  • We should have a nicer way to tell what accounts need to be updated. Highlight them red if they haven’t been updated in more than a month (configurable). Put the last updated date in the balance sheet or perhaps the trial balance page. Should be easy; we don’t need a dedicated page for this.
  • Shove more information in the Trial Balance page, info about errors, documents, etc.

Source

  • The source page should take a special ‘?line=ddd’ parameter that will scroll the page to the transaction at that line.

Conversions

Small Projects & Challenge Ideas

Maximum Balance

  • Can I compute the maximum value of each account at the end of every year (for foreign assets declarations)? This would be useful for FBAR / FATCA declarations.
  • Automatically compute the maximum account values of foreign accounts for the FBAR filing.
  • You should report a trial-balance -like listing of the minimum and maximum values of all the accounts.
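A sketch of the per-year minimum/maximum computation, assuming postings arrive as date-sorted `(date, account, number)` tuples in a single currency. `yearly_extrema` is a hypothetical name. As a known limitation, a balance carried into a year with no posting in that year is not reported.

```python
import datetime
from collections import defaultdict
from decimal import Decimal


def yearly_extrema(postings):
    """Return {(account, year): (min_balance, max_balance)} of running balances."""
    balances = defaultdict(Decimal)
    extrema = {}
    for date, account, number in postings:
        balances[account] += number
        balance = balances[account]
        key = (account, date.year)
        low, high = extrema.get(key, (balance, balance))
        extrema[key] = (min(low, balance), max(high, balance))
    return extrema


account = "Assets:CA:RBC:Savings"
extrema = yearly_extrema([
    (datetime.date(2013, 1, 5), account, Decimal("1000")),
    (datetime.date(2013, 6, 1), account, Decimal("500")),
    (datetime.date(2013, 9, 1), account, Decimal("-1200")),
    (datetime.date(2014, 2, 1), account, Decimal("100")),
])
```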

Property Lifetime Return

  • Challenge: Can I compute IRR return on my condo purchase and sale, accurately accounting for all the little expenses and cash flows along the way? TODO: Add benefits received as an Income, as transactions.
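For reference, an annualized IRR over dated cash flows can be found by bisection on the net present value. This is a generic sketch, not Beancount's returns code; it assumes a 365-day year and that the NPV changes sign exactly once inside the bracket.

```python
import datetime


def irr(cash_flows, low=-0.99, high=10.0, tolerance=1e-9):
    """Annualized internal rate of return found by bisection.

    'cash_flows' is a list of (date, amount): negative for money paid out,
    positive for money received.
    """
    start = min(date for date, _ in cash_flows)

    def npv(rate):
        return sum(amount / (1.0 + rate) ** ((date - start).days / 365.0)
                   for date, amount in cash_flows)

    if npv(low) * npv(high) > 0:
        raise ValueError("No sign change in the bracket")
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if npv(low) * npv(mid) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0


rate = irr([(datetime.date(2010, 1, 1), -100.0),
            (datetime.date(2011, 1, 1), 110.0)])
```

For the condo challenge, every expense, payment, and the final sale proceeds would each become one dated cash flow.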

Taxation Rate

  • Challenge: Can I automatically compute my taxation rate for every year?

Currency Exposure Report

  • For a particular balance sheet, report the total currency exposure of the ledger. This should be a very simple report, probably in the form of a pie chart. Maybe this pie chart should be located in the capital report (possibly makes sense).

Inflation Adjusted Reporting

  • It would be AWESOME to look at balance sheets from the past but inflation-adjusted for any date… Answer this question easily:

    “What was I making in 2010 in today’s dollar terms?”

  • How would I produce an inflation-adjusted version of some charts? Maybe all charts should have that option?
  • Look at average meal 10 years ago, average electricity, etc. things that should be equal, and correct for the time-value-of-money, compare prices today with prices then. Maybe come up with some kind of constant unit that I can convert everything to.

Account Linkage Report

  • Generate a Graphviz link of all the inter-account transactions that are larger than a certain amount.

    Generate a graph for the main kinds of account interchanges, by looking at the most significant transactions between the accounts. For example, ignore transactions which are too small in absolute value, or whose total is too small a portion of the total.

    Fun little project: Create a graphviz output from a Ledger, where transactions above a certain amount would generate a link between accounts. Note: the threshold could be either for single transactions, or for aggregate amounts (absolute value).
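A sketch of the aggregate-threshold variant, emitting Graphviz DOT text. It assumes the inter-account flows have already been extracted as `(source, target, amount)` triples; how to derive those from multi-leg transactions is left open, and `account_graph` is a hypothetical name.

```python
from collections import defaultdict


def account_graph(transfers, threshold):
    """Render a DOT digraph of aggregate flows between accounts.

    An edge is emitted when the summed absolute amount between two accounts
    reaches 'threshold'.
    """
    totals = defaultdict(float)
    for source, target, amount in transfers:
        totals[(source, target)] += abs(amount)
    lines = ["digraph accounts {"]
    for (source, target), total in sorted(totals.items()):
        if total >= threshold:
            lines.append('  "{}" -> "{}" [label="{:.2f}"];'.format(
                source, target, total))
    lines.append("}")
    return "\n".join(lines)


dot = account_graph([
    ("Assets:US:HSBC:Checking", "Expenses:Food:Restaurant", 40.0),
    ("Assets:US:HSBC:Checking", "Assets:US:ETrade:IRA:Cash", 3000.0),
    ("Assets:US:HSBC:Checking", "Assets:US:ETrade:IRA:Cash", 2500.0),
], threshold=1000.0)
```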

Predict Vacation Cap Date

  • Make a plugin that computes the precise date at which my vacation will cap (240 VACHR) based on an account.

Account Hiding Criterion

  • Set the closing criterion for empty accounts, implement in a single place, and review all the code which renders balances to use it.

    “I used to have it so that accounts closed before the beginning of the exercise (in the reports) would not appear. Accounts closed at the end of the period but with some activity within the period would appear (so you can click on them and see their journal). Opened accounts with a zero balance would, too. Closed before begin + no activity = no show.”

    2000-01-10 open Assets:Continuing
    2000-01-10 open Assets:Empty
    2000-01-10 open Assets:Terminated

    2000-01-10 open Equity:Whatever

    2014-03-10 *
      Assets:Continuing   110 USD
      Assets:Terminated   120 USD
      Assets:Empty        130 USD
      Equity:Whatever

    2014-03-30 *
      Assets:Terminated  -120 USD
      Assets:Empty       -130 USD
      Equity:Whatever

    2014-05-15 close Assets:Terminated

    2015-01-10 *
      Assets:Continuing   110 USD
      Equity:Whatever

  • Whether an account shows up in a particular Ledger (realization) really only should depend on whether the account was open during the period (we now have account open/close dates… let’s use them instead of a heuristic!). Create a routine to figure out if an account was open during a specific time period?
  • Related topics: (ticket) https://bitbucket.org/blais/beancount/issues/36/balances-output-sometimes-outputs-000 (fava) https://github.com/aumayr/fava/issues/292#issuecomment-219563582
  • One question is whether we should display an account which has a non-zero balance but when rounded for display rounds to a zero number?

    “the problem is not the sell, but the buy that leaves -0.00120 USD in Assets:XYZ”
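The routine asked for above, deciding whether an account was open during a period from its open/close dates, might look like this sketch. The half-open period `[begin, end)` and the name `is_open_during` are assumptions.

```python
import datetime


def is_open_during(open_date, close_date, begin, end):
    """True if the account's open interval overlaps the period [begin, end).

    'close_date' is None for accounts that were never closed.
    """
    if open_date >= end:
        return False
    if close_date is not None and close_date < begin:
        return False
    return True


# Assets:Terminated from the example: opened 2000-01-10, closed 2014-05-15.
opened, closed = datetime.date(2000, 1, 10), datetime.date(2014, 5, 15)
shown_2014 = is_open_during(
    opened, closed, datetime.date(2014, 1, 1), datetime.date(2015, 1, 1))
shown_2015 = is_open_during(
    opened, closed, datetime.date(2015, 1, 1), datetime.date(2016, 1, 1))
```

This matches the quoted rule: an account closed within the period still shows up, while one closed before the period begins does not.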

Rejected Ideas

  • Why aren’t we using the price on the first leg of this transaction? This is an interesting variation on the meaning of the price: it could mean either (a) the price of the lot, or (b) the conversion price of the cost of the lot. This would enable the following:

    2013-07-22 * "Bought some US investment in a CAD account"
      Assets:Investment:HOOL    50 HOOL {700 USD} @ 1.01 CAD  ;; 35350 CAD
      Assets:Investment:Cash    -35359.95 CAD
      Expenses:Commissions      9.95 CAD

  • Add a --auto-everything option to bean-check that automatically inserts a beancount.plugins.auto_accounts directive and more.

To Be Categorized

Reporting Plugin Errors

  • Build a utility function to parse plugin configs using ast.literal_eval() and catch and produce a consistent error message when an invalid Python expression is provided. Convert all the plugins to use it.
  • When rendering errors, render the data type so the error tells you which component or plugin generated it.
  • Parse and save the line no for “plugin” directives in order to improve their error reporting.
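A sketch of the shared config-parsing utility described above, built on `ast.literal_eval()`. The error tuple and the function name are hypothetical; the point is that every plugin would report a malformed config the same way instead of raising.

```python
import ast
import collections

# Hypothetical error type, shaped like a (source, message, entry) tuple.
PluginConfigError = collections.namedtuple(
    "PluginConfigError", "source message entry")


def parse_plugin_config(config_string, plugin_name):
    """Evaluate a plugin config string; return (value, errors)."""
    if not config_string:
        return None, []
    try:
        return ast.literal_eval(config_string), []
    except (ValueError, SyntaxError) as exc:
        message = "Invalid configuration for plugin {}: {!r} ({})".format(
            plugin_name, config_string, exc)
        return None, [PluginConfigError(None, message, None)]


value, errors = parse_plugin_config("{'threshold': 10}", "my_plugin")
```

Including the plugin name in the message also covers the second item: the error tells you which plugin generated it.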

doctor / tools

  • Add a list of posted balances by account to “bean-doctor context”. “linked” always links all the transactions, so it’s not good enough. Just add this by default to the context command.
  • Great idea: Make the printer able to (1) output “incomplete” entries and (2) print out the entries in the order of (filename, line-no) in which they were parsed, to produce a file that is as close as possible to the input. (bw3443 asked for this in order to make modifications to his input and push that back out to a file; this could be useful.)

Improvements to Emacs Support

  • Investigate indent-region and see if it wouldn’t make sense to override this in order to invoke beancount-align / beancount-format.
  • Add an Emacs mode command to compute a command at the date of the transaction preceding or at the cursor line (with C-u). Without a C-u prefix, compute the balances at the latest date. The point is that the user shouldn’t have to type in the bean-query <filename>, other than perhaps having to type “balances”. To work quickly on smaller files.

Transaction Lists which can be Explored with Emacs

  • Add a “lineno” format for transactions that render in “Emacs errors” compatible format, so we can easily jump in time throughout the input file instead of rendering a journal. Offer the option to list in reverse too. This should be invokable from the SQL shell.
  • Register (with filter) should have “print” mode that also includes file:lineno so that we can make Emacs “jump” between the transactions in the order they appear.

Trading

  • Produce example transactions for each of the cases provided by Filippo’s sample transactions.

Tools: Formatting

  • BUG: There is a bug in bean-format; on /home/blais/sharing-with-roommate/liabilities-account-solution.beancount, it fails with an assertion. I highly suspect that it’s because of postings like this:

    Expenses:Electricity      45.34/2 USD
    Liabilities:Alice         45.34/2 USD

Crypto

Start thinking about some features for crypto users.

  • Make it possible to transfer lots with cost between accounts without having to specify the full detail (use matching).

Misc Grab Bag of Ideas

Here follows a grab bag of ideas. When I have a new idea coming to me, I don’t have time to think about where to put it, I just come here at the end and jot it down. Every couple of years I clean this mess up and put it in the sections above.

  • Add HIGHEST_COST booking method.
  • Make queries with group by order by unaggregated columns by default (output should be stable even without ORDER BY clause).
  • IDEA: In linked and context commands, output the sum of all positive and all negative numbers (like “Net Income” sum in ‘linked’).
  • IDEA: For ugly transactions like this, it’s convenient to repeat account names and amounts, like this:

    2020-08-09 * "Delta Dental" "Processing claim" ^deltadental-XXXXXXXXXXXx
      service: 2020-08-07
      Expenses:Health:Dental:NonEligible       295.00 USD
      Expenses:Health:Dental:NonEligible       -50.00 USD
      Expenses:Health:Dental:Deductible        50.00 USD
      Expenses:Health:Dental:NonEligible       0.20 * -202.00 USD
      Expenses:Health:Dental:Coinsurance       0.20 * 202.00 USD
      Expenses:Health:Dental:NonEligible       0.80 * -202.00 USD
      Expenses:Health:Dental:Allowed           0.80 * 202.00 USD
      Expenses:Health:Dental:Reimbursement     -161.60 USD
      Assets:AccountsReceivable:Anthem:Dental

    But we don’t want them applied as such. It would be nice to have those automatically turned into:

    2020-08-09 * "Delta Dental" "Processing claim" ^deltadental-XXXXXXXXXXXx
      service: 2020-08-07
      Expenses:Health:Dental:NonEligible       43.00 USD  <-- here.
      Expenses:Health:Dental:Deductible        50.00 USD
      Expenses:Health:Dental:Coinsurance       40.40 USD
      Expenses:Health:Dental:Allowed           161.60 USD
      Expenses:Health:Dental:Reimbursement     -161.60 USD
      Assets:AccountsReceivable:Anthem:Dental

    It’s fine if this is done only for postings with the same currency and no cost basis, or postings with a particular flag on, or just for some subset of accounts. “automeld”? This would be nice in v3, as the original parsed transaction can keep all the original detail while the reduced one only appears in the final flow.

    Or maybe… this should be global and automatic? Implement it, and print out which transactions would be affected to find out.

  • `bean-doctor linked` command should render the balances after each transaction. Either horizontally (new columns), or a dedicated tree right after each. Add an option. This would be nice for debugging.
  • `bean-doctor linked`: The account types aren’t in the correct order. It shows Assets, Expenses, Liabilities… put all the balance statement accounts first, then income statement accounts, finally, equity.
  • Plugin idea: A plugin that ensures that there is no residual commodity in an account as of its closing.
  • For Vanguard RothIRA, I need a balance check that applies only to a subset of linked transactions. I’m being provided with this number in the imported file but I have to split the transactions themselves.
  • Turn all FIXME into TODO(blais) and NOTE(blais).
  • “find $PWD | treeify -F” is broken; used to work. Update and fix this.
  • Make the spreadsheets <-> beancount upload/download conversion scripts more general and flexible and provide them to the general public. These are really powerful.
  • When env var L is set to a directory, failure just says that the Lex scanner fails; this should mention the name of the file, so it’s clear. Sometimes I override “L”.
  • In order to render all linked transactions, support pulling the referenced link or tag name under the cursor (in Emacs).
  • In beancount.plugins.auto_accounts: It would be really nice to constrain which types of accounts can be auto-created, for example, don’t auto-create Assets/Income accounts. This would be helpful in trip files.
  • Idea! Add original conversion rate on an augmenting leg with cost basis, in order to be able to partition profits due to appreciation vs. currency drift. See https://groups.google.com/d/msgid/beancount/939fe47e-382a-4272-94e6-dd29f660d519%40googlegroups.com?utm_medium=email&utm_source=footer

    “This brings up an interesting point, one which I had never given much thought before: if we wanted to solve this problem - and I think it may be worthwhile to do so - couldn’t we just (optionally) store the original exchange rate at the time the lot was converted, with the lot (i.e., Position object) itself? That is, each lot would store:

    • units (amount, currency)
    • cost (amount, currency, acquisition-date, label, rate*)

    (*) This being the new bit.

    Given those rates, we should be able to compute the current value of the cost basis in the local currency, and that against how much it cost us originally, should tell us the proportion of gain/loss due solely to currency appreciation/depreciation (ignoring the effect of relative interest differences between the two currencies).”

  • Three interesting ideas for extending booking further here: https://groups.google.com/d/msgid/beancount/9c9dcc5f-72bf-44b2-aaf9-4e0c07cbff77%40googlegroups.com?utm_medium=email&utm_source=footer
    1. When you apply the partial booking specification, it could be applied against the booked legs of the transaction in order to select cost bases. I’m not sure if this generalizes.

      “Finally, the partial specification, say, just specifying the date, is used to narrow down the lot against the list of possible lots in the Inventory of the account before applying the transaction, but NOT against the list of other postings. That would be an interesting power to add to the booking system, as it could disambiguate this case.”

    2. One could assume a single currency group in each transaction when there’s at least one posting at cost.
    3. One could also assume that in that account a commodity is never priced in terms of two different currencies.
  • Allow plugins to run before and after booking; this would make it possible for plugins to run on CostSpec and fill in more information. In fact, maybe the booking process itself could be moved to a plugin. This could be a powerful idea, in that it would clarify the distinction between the two streams of transactions.
  • Add support for this data source: https://www.alphavantage.co/documentation/
  • Sometimes it might be useful to end the stream of useful transactions at a particular point in the input file. Normally this is best done by filtering in the queries themselves, but there are times where having an End directives specified temporarily in the input file would have been useful (e.g. when traveling and running temporary balances while editing the input file). Consider adding this facility, it might be a convenience, especially to get around some of the absence of time issues.
  • In the new SQL shell, allow the plugins to define new subsets of postings (transactions), so that they can be queried separately. For instance, when I’m traveling, it should be possible to query the set of transactions before or after the split_expenses plugin ran. This would actually have been useful while traveling, because the split_expenses plugin doubles up a lot of the postings (that’s what it does…).
  • Document the “{ <num> # <num> <ccy> }” syntax properly, in a dedicated place.
  • New currency sources: https://www.ecb.europa.eu/stats/policy_and_exchange_rates/euro_reference_exchange_rates/html/index.en.html https://news.ycombinator.com/item?id=15616880
  • Idea: Add “contra” as a new column, when the sign posted in the wrong direction based on the account type. Or add a boolean “contra” virtual column (or function, from account name).
  • Here’s another way to implement an intra-day balance: define a new balance directive type, whose semantic is to sort all the entries, but the entries on the same day of the balance are sorted in file order. I think this would handle the most common use case. Try it (experiment).
  • It might be interesting to support some sort of transfer syntax that would allow the movement of Position’s across accounts, without converting to the cost, something like this:

    2000-01-01 * "Transfer shares from BrokerOne to BrokerTwo to sell at BrokerTwo"
      Assets:BrokerOne      -10 HOOL {}
      Assets:BrokerTwo

    While this example is simple enough, a fully general version of this might not be, but it’s worth prototyping. This would require some sort of new syntax for these “transfers.”

  • Rationalize the strategy for reporting errors.
  • Rendering idea: If a posting moved the balance in the unusual direction (positive for Income, negative for Expenses), it could be rendered with a highlight or a different color. These are rare and unusual. One could also do the same for Assets (when the amount moves up) or Liabilities (when it moves down).
  • Create a new “AssertQuery” directive to assert that the result of a particular query produces a particular Inventory. For example, you could check your total contributions to an account over a period of time like this:

    2018-03-11 assert_query "
      select sum(position)
      where year = 2018
        and account ~ 'Assets:US:Vanguard:Retire:AfterTax:Cash'
        and number > 0" == 27250 USD

  • In the account name completion function in Emacs, don’t include the name of closed accounts.
  • Ingest should sort the transaction using a secondary “txndate:” metadata field and document this somewhere.
  • Idea: Upon an irrecoverable error in the postings, instead of skipping a transaction, record it without any postings. This will allow “bean-doctor context” to at least find the offending transaction.
  • Idea: Limit the absolute value that can be posted to rounding accounts. The limits could be automatically derived from the tolerance values.
  • Integrate the docs from here: https://bitbucket.org/aumayr/beancount
  • Idea to deal with #145: Add all augmenting postings, THEN book all reducing ones against that.
  • Full booking: We need an example of two reducing bookings competing for the same lot.
  • Implement IPython/Jupyter notebook support for directly viewing the results of queries.
  • “bean-doctor linked” should list the list of links that were followed. That would go some of the way to explain some of the sometimes surprising results.
  • SQL: OpenMeta is different than Meta; it returns a dict. That’s bad. Provide a direct lookup, like Meta.
  • Complete tests for csv importer; it should be tested frantically, and a decent sniffer should be implemented, specifically with credit/debit support and balance vs. posting amount support, as well as detecting fees.
  • Add an AVERAGE function to the SQL shell. See related discussion, I would like to compute this for myself too: https://groups.google.com/d/msg/ledger-cli/dWuWysV6qZU/cyYNzleTCAAJ
  • There ought to be some unit tests for all the custom metadata added to Posting instances. I think this is undertested at the moment.
  • Implement a new plugin to convert metadata-fields to Document directives.

    Scan all transactions that have the invoice-metadata field (or any other suiting field-name of your choice), and create a Document directive for each, with a link to the transaction it comes from.

    In doing so, add support for “fuzzy-matching”: When only a relative filename is specified in the metadata-field (and Document directive), search different locations based on the Account this is on/in.

    See https://github.com/aumayr/fava/issues/386#issuecomment-256895713.

  • Core: Some problems occur (with the new method) if you mix at-cost and without-cost for the same currency, like this:

    2015-09-15 * "Buy"
      Assets:Investments:HOOL      50.795 HOOL {14.63 USD} @ 14.63 USD
      Assets:Investments:Cash      -743.13 USD
      Equity:Vanguard:Rounding

    2015-09-24 * "Dividend Received"
      Assets:Investments:HOOL      0.252 HOOL {14.11 USD} @ 14.11 USD
      Income:Vanguard:Retire:AfterTax:HOOL:Dividend      -3.56 USD
      Equity:Vanguard:Rounding

    2016-05-09 * "Conversion From"
      Assets:Investments:HOOL      -51.047 HOOL @ 14.33 USD
      Assets:Investments:Cash      731.50 USD
      Income:Vanguard:Retire:AfterTax:HOOL:Pnl

    2016-05-10 balance Assets:Investments:HOOL      0 HOOL

    2016-05-10 * "Buy"
      Assets:Investments:HOOL      292.285 HOOL {14.33 USD} @ 14.33 USD
      Assets:Investments:Cash      -4188.44 USD
      Equity:Vanguard:Rounding

    The better way to deal with this would be not to allow it to happen. Because this needs to be applied during matching, we cannot implement this in a plugin, this needs to become an option (to allow it, and to make it disallowed by default) in the parser itself.

  • The Beancount loader cache should be automatically invalidated if the revision of Beancount has been updated. This could avoid some potentially confusing user interpretation of errors.
  • Implement beancount.plugins.auto and beancount.plugins.pedantic finally.
  • You need to add options to provide the precision to use for prices, otherwise something like this:

    2016-07-15 * "Resto" "Nice dinner near Storm Hotel" ^b4cd7330bc3d
      Expenses:Food:Restaurants      12500 ISK
      Income:Martin:CreditCard       -103.18 USD @ ISK

    will render as

    2016-07-15 * "Resto" | "Nice dinner near Storm Hotel" ^b4cd7330bc3d
      Expenses:Food:Restaurants:Martin        6250 ISK              ; 6250 ISK
      Expenses:Food:Restaurants:Caroline      6250 ISK              ; 6250 ISK
      Income:Martin:CreditCard                -103.18 USD @ 121 ISK ; -12499.99999999999999999999999 ISK

    instead of

    2016-07-15 * "Resto" | "Nice dinner near Storm Hotel" ^b4cd7330bc3d
      Expenses:Food:Restaurants:Martin        6250 ISK                  ; 6250 ISK
      Expenses:Food:Restaurants:Caroline      6250 ISK                  ; 6250 ISK
      Income:Martin:CreditCard                -103.18 USD @ 121.148 ISK ; -12499.99999999999999999999999 ISK

    Right now, the only solution is to provide at least one price with a suitable precision.

    Another solution, instead of letting the user select the precision, for inferred prices, would be to infer maximum precision by default if a digit of precision hasn’t been seen yet. Round prices shouldn’t imply 0 digits of precision when it comes to inferred prices, for rendering.

    Note that the actual numbers derived are correct, the only problem is how they get rendered by default.

  • Try to modify b.l._parse_recursive() to reuse the DisplayContext object across all the parsed files, in order to accumulate its full state. Some users (from fava, e.g. Daniel Bos, see thread on mailing-list) are defining a single top-level file with only includes, and this breaks the rendering of the fractional digits in the web interface in a severe way.
  • Investigate using PrettyTable over my custom one. Is there any point to that?
  • Put all the contents of experiments/ under their own directories and add decent README files for each of these.
  • Write a new plugin to spread cash transactions over multiple dates. Use a template transaction in the input file as input.
  • Setup continuous integration that would push build status for each commit to bitbucket’s build status API. Buildbot already has a plugin for this.
  • In upload_assets.py, add the export date somewhere very visible.
  • Convert the upload.py script to use the Sheets v4 API instead of gspread.
  • Produce pivot on monthly expenses as a custom report, the need for this occurs too often to wait for the SQL shell to support it.
  • The Beancount presentation should include the table’s implicit join, graphically, and then explain select/filter/aggregate. Include a section in the design doc about this, or perhaps even in its own doc.
  • Add support for the ABC’s for the pure interfaces, e.g. beancount.prices.sources, beancount.ingest.importers, etc.
  • Start accumulating test cases from real data for similarity classification, and rewrite this function from scratch, it just doesn’t work well at all.
  • Port all my wash sales scripts to use the new booking method. This ought to be working now, with the new booking method.
  • Create an explicit license for my documentation, one that is compatible to publishing this as a book later on.
  • BUG: The “bean-doctor context” command does not provide any useful information if there was an error on the transaction at context and its postings are empty. If we’re looking to get the set of lots before and after, it’s not obvious.
  • Compute and render the difference on a sellgains error. In order to make this work, you’ll have to implement a difference between inventories.
  • This will fail because of an unfortunate ordering in the transactions on the same date:

    2014-08-11 * "Google Massage - 60 minute Table Massage" | "Online purchase"
      Assets:US:GoogleInc:Wallet       -60.00 USD
      Assets:US:GoogleInc:Massage      120 RUB {0.50 USD}

    2014-08-11 * "Table Massage 60 mins - GB" ^7430817d565c
      Assets:US:GoogleInc:Massage      -60 RUB {0.50 USD}
      Expenses:Health:Healing:Massage

    This is normally fixed by fudging the date on the second transaction to 8/12. The thing is, we could either

    • Make the check allow a temporary negative balance if on the same date, or
    • Reorder the transactions so that the positive amounts always occur first. (I’m not sure that’s always possible; imagine a combo of postings that requires the opposite order in order for this to work).

    This problem occurs rarely, but when it does, it’s annoying to deal with because the error removes the transaction and it cascades to more errors. Fixing this would help usability a fair amount.

  • This type of interpolation with per-account currency constraints should just work:

    YYYY-MM-DD open Expenses:Communications:Internet      USD
    YYYY-MM-DD open Assets:US:Points:Verizon              VERIPTS

    2017-03-12 * "VERIZON ONETIMEPAY VERIZON.COM FL" "PHONE SRV"
      Liabilities:US:CreditCard             -178.37 USD
      Expenses:Communications:Internet
      Income:US:Rewards                     -650 VERIPTS
      Assets:US:Points:Verizon

  • You should definitely do some anomaly detection on the list of prices, when prices go out of whack with the rest of the time series. I had this happen with a rewritten and buggy new price source. This could be done in a plugin and would be useful, as invalid prices can manifest in some wildly incorrect totals, and this is a common error. Easy with good payoff on correctness. Make this part of the pedantic list as well.
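
As a rough illustration of the price anomaly check in the last item, here is a sketch that flags a price when it is wildly out of line with the median of its neighbors. The function name, the window size and the ratio threshold are all assumptions made for the example, and the input is a plain list of (date, Decimal) pairs rather than real Price directives.

```python
import datetime
from decimal import Decimal
from statistics import median


def find_price_outliers(prices, window=5, max_ratio=Decimal("3")):
    """Return the (date, price) points that look out of whack.

    `prices` is a date-sorted list of (date, Decimal) pairs for a single
    currency pair. A point is flagged when it is more than `max_ratio` times
    larger or smaller than the median of up to `window` neighbors on each side.
    """
    outliers = []
    for index, (date, price) in enumerate(prices):
        neighbors = [p for _, p in prices[max(0, index - window):index]]
        neighbors += [p for _, p in prices[index + 1:index + 1 + window]]
        if not neighbors:
            continue  # A single price has nothing to be compared against.
        if price <= 0:
            outliers.append((date, price))
            continue
        reference = median(neighbors)
        if reference <= 0:
            continue  # The neighborhood itself is unusable as a reference.
        ratio = price / reference
        if ratio > max_ratio or ratio < 1 / max_ratio:
            outliers.append((date, price))
    return outliers


series = [
    (datetime.date(2017, 1, 2), Decimal("100.10")),
    (datetime.date(2017, 1, 3), Decimal("101.25")),
    (datetime.date(2017, 1, 4), Decimal("10130.00")),  # e.g. a units mix-up
    (datetime.date(2017, 1, 5), Decimal("102.00")),
    (datetime.date(2017, 1, 6), Decimal("103.40")),
]
for date, price in find_price_outliers(series):
    print("Suspicious price on {}: {}".format(date, price))
```

The median is used instead of the mean so that a single bad point does not drag the reference value along with it. A plugin version would group Price directives by (currency, quote currency) and emit one error per flagged point.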

Google Keep Tickets

  • Unicode support: add a basic filter with ‘ignore’ to a temp file, before I finish having a proper lexer. Kick the tires off a lexer anyhow.
  • Idea: select * from transactions, but you have to find a way to open/close/clear either before filtering or after.
  • Print out metadata from the C-c x bean-doctor context command
  • Get to a point where you can compute long term vs short term
  • For testing decimal, work the renderers from a given dictionary, either computed or given globally. A maximum should be enforced too.
  • Testing for rendering: needs to test with numbers under prec, over prec, integers, with wild prec (enforce max).
  • Add a display_precision option to allow the user to override it.
  • Add warning on repeated metadata value key
  • Returns: identify entries with intflows + extflows on one transaction, without assets, they should be printed out for review in dates after all assets accounts are opened
  • Preso: offer an estimate of data size for a ledger and show how it’s really small.
  • Finish prior to spreadsheet replicating g finance
  • Add change from trailing day, week, month, helps making decision based on recent trends
  • Idea: introduce balance checks on preso, immediately motivates the de method! Highlight errors
  • If I sell my entire position of multiple lots the booking should be able to infer that this is unambiguous
  • Don’t allow negative units by default: modify the STRICT method to disallow that.
  • Idea: stock split could be its own directive with a corresponding stream transformation, all defined in a plugin
  • Idea: wash adjustments to adjustment account and zero balance in 2015, instead of washing to pnl account. Do this now.
  • Beancount: Review terminology used in documents, use “Clearing” instead of “Transfer.”
  • Idea: report only on last posting of a transaction!
  • In reading example of stock split, add one that maintains the original purchase date
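
The precision items in this list (render from a dictionary, enforce a maximum, test under/over/integer/wild precision) could be prototyped with something like the sketch below. The function name, the dictionary layout and the cap of 8 digits are assumptions for illustration only; this is not how DisplayContext is implemented.

```python
from decimal import Decimal

# Hypothetical cap on the number of fractional digits ever rendered.
MAX_PRECISION = 8


def render_number(number, currency, precisions, max_precision=MAX_PRECISION):
    """Render `number` with the precision recorded for `currency`.

    `precisions` maps a currency to a number of fractional digits. A currency
    without an entry falls back to `max_precision`, and no entry is allowed
    to exceed it.
    """
    digits = min(precisions.get(currency, max_precision), max_precision)
    return "{:.{}f}".format(number, digits)


precisions = {"USD": 2, "ISK": 0, "BTC": 28}
print(render_number(Decimal("1.5"), "USD", precisions))      # under precision
print(render_number(Decimal("121.148"), "USD", precisions))  # over precision
print(render_number(Decimal("12500"), "ISK", precisions))    # integer
print(render_number(Decimal("0.1"), "BTC", precisions))      # wild precision, capped
```

A user-facing display_precision option would then amount to overriding entries of the dictionary before rendering.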

Sqltool why interesting ?

  • Automatic schemas
  • Typed calculations
  • Single line
  • Basic operations
  • Running commands
  • Broadcast concept, for inventories
  • Data (…, “html”, 1), Html(…)
  • Sources: Beancount, ledger, hledger, CSV, dbm, Xls, xml, etc.

Taxes & Finance

  • In Canada, all trading is carried out at average lot! Since when? Adjust my Beancount file.
  • When in foreign land, how much is the base deduction (-93k?) and are you only subject to federal tax rate when you’re not there?
  • Q: Can you convert short term gains into long ones via a wash sale?

File Entries from TODO file

Urgent-ish

  • Unify bean-doctor linked and bean-doctor context; make it one, and C-u prefix triggers links, perhaps ask for which link
  • Create a demo video about “context” and “linked” command-line tools.
  • Add a function to fetch a particular subset of tags with a regular expression, e.g. FIRST_REGEXP(‘award’). Use this on the imported stockstatement transactions. This can be used to create a column out of a set of tags.
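
A sketch of what the tag-matching function could compute, written as a plain Python function. FIRST_REGEXP is only the name proposed above, not an existing query function; the lowercase helper here is hypothetical, and the sorted iteration is an assumption made so that the result is stable when tags are stored as a set.

```python
import re


def first_regexp(pattern, tags):
    """Return the first tag matching `pattern`, or None if none matches.

    `tags` may be None or any iterable of strings. Tags are visited in sorted
    order so that the result does not depend on set iteration order.
    """
    regexp = re.compile(pattern)
    for tag in sorted(tags or ()):
        if regexp.search(tag):
            return tag
    return None


print(first_regexp("award", {"award-2014-03", "vesting", "stockstatement"}))
```

Used as a column expression, this would yield one value per transaction, which is what makes it possible to build a column out of a family of tags.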

Documentation / Process

  • Complete the CA and handle all remaining pull requests and minor bugs.
  • Improve the Lang Syn doc right before the Unused Pad Directives by adding an intervening transaction instead of explaining how it works. It would be much clearer.
  • Write a doc (video) that explains how to debug issues, context, linked, print, etc.

Branches

  • Finish work on booking branch
  • SQL tool really ought to be extracted away from Beancount, extract as a separate project.

Small Tasks

  • A large imbalance will appear as a RoundingError, which it shouldn’t! RoundingErrors should only be inserted when a transaction balances! Run a context on a non-balancing transaction and you’ll see what I mean.
  • Build a script that can list all my cap. gains for tax-loss harvesting.
  • Modify networth script to list post-tax worth, basis + pre-tax value = post-tax value, segment based on accounts.
  • Finish that Health Care tracking document.
  • Move over the updating process to Google Spreadsheets solution.
  • Remove all double descriptions in importers in ledgerhub (when payee == narration).
  • beancount-linked is broken, missing line number, and an exception is raised: bean-doctor linked blais.beancount 57660; fix this now, rename as beancount-context-linked and add a better binding. Use this a lot more.
  • Booking: generate all combinations for just currencies, missing or not, in order to complete categorize implementation; (1) apply cost-currency = price-currency constraint, (2) infer currency from other legs, (3) infer currency from inventory.
  • Define a way to specify output format for rendering many queries, and perhaps that should include a Google doc directly that makes an upload, even if that has to call out to a Python2 subprocess (that’s fine). Use that everywhere.
  • Add “display_tolerances” rendering override to fix bug with rendering HKD in trips, should be mirroring tolerances input syntax. Bug with Beancount: journal “Assets:Caroline:Octopus” does not render with correct precision.
  • Remove unused auto-postings, they distract from the reporting.
  • Deal with flattening now (default should be never to flatten, never render empty fields on a continuation line for an inventory) and THEN implement CSV rendering.
  • Implement implicit GROUP-BY now.
  • Move “bean-doctor missing-open” to “bean-reports missing-open”, this really is a report type.
  • Merge 3 newer monnier beancount.el patches.
  • Bug: an unbalanced tag should output a correct line number in the error. Right now mistaking push/pop as push/push outputs an error with a zero line number, not useful. Need this. Fix this.
  • Complete ultipro tables from PDF parser.
  • payees: train a little model on my payee names to convert them to nice, clean payee names. Extract the data now.
  • returns2: Returns revamp + web reports.
  • booking: Implement booking proposal.
  • export: Export to spreadsheets.
  • Document arithmetic operations support
  • Export to Google Spreadsheets: Complete and document how to upload to a Trix
  • Doc on exportpf
  • Conversions: Document how open close clear works. This is really important. Use as an important example the desire to pre-enter transactions from the future (e.g., Jon Stahl’s vesting events).
  • Write Intro with an explanation of open/close/clear.
  • Finish parser cleanup for Unicode.
  • Store residual in metadata, print it out in context
  • “Transaction does not balance” message should include the tolerance value allowed.
  • Complete “legacy” doc generation (merge assets.asc into that doc, we shouldn’t need assets.asc once that’s done, and schematize its contents) And start using this.
  • Plugins doc: Add documentation for sellgains plugin in the rest of the docs (and of all other plugins).
  • Add a section about Options in the Trading document, with these contents.
  • Implement year close via command-line.
  • Convert dividends to per-stock returns.
  • Wrap up docs? Use the framedocs script?
  • This should be the goto/ project, it should be able to do that.
  • Definitely make a teaching video; this will be useful for applications to other things!
  • Write a doc specifically on handling the vesting of shares, including the tracking of unvested shares.
  • Write doc on booking taxes NOW.
  • Complete unfinished doc on health care.
  • Convert blais.portfolio to a number of SQL scripts.
  • Why doesn’t ^c2fbec103d99 generate an error?!? Ha! It’s because it’s NOT a mixed result: only negatives. Maybe allow only positive totals with a different mode, somehow. A constraint on dictionaries, another booking method. Maybe that should be the default: don’t allow negatives at all.

Accounting

  • Measure time taken to find and process documents in ingest and only cache it if very long.
  • Document the idea of implementing a virtual SQLite table for Beancount
  • Plan is to support smart sniffers
  • Convert to schema automatically
  • Load them into SQLite tables, or make sources as adapters to virtual tables
  • Add a first step that’s a conversion

SQLite

  • Q. Can we invoke arbitrary Python functions from SQLite?
  • Look at termsql & sqlparse
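
On the first question: the standard library sqlite3 module does allow registering a Python callable as a SQL function, through Connection.create_function(). The sketch below shows the mechanism on a made-up postings table; the table layout, the amounts in integer cents and the account_root helper are illustrative assumptions, not an existing Beancount schema.

```python
import sqlite3


def account_root(account):
    """Return the first component of an account name, e.g. 'Assets'."""
    return account.split(":", 1)[0]


conn = sqlite3.connect(":memory:")
# Register the Python function under a SQL name, taking one argument.
conn.create_function("account_root", 1, account_root)

conn.execute("CREATE TABLE postings (account TEXT, cents INTEGER)")
conn.executemany(
    "INSERT INTO postings VALUES (?, ?)",
    [
        ("Assets:Cash", -1275),
        ("Expenses:Food", 1000),
        ("Expenses:Transport", 275),
    ],
)

rows = conn.execute(
    "SELECT account_root(account) AS root, SUM(cents) "
    "FROM postings GROUP BY root ORDER BY root"
).fetchall()
for root, total in rows:
    print(root, total)
```

Scalar functions are the simple case; aggregates over richer types such as inventories would need Connection.create_aggregate() and some way to serialize the values, which is left open here.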

Idea for talk

  • Start with a description of all the desired processes and their outputs
  • Then put down the realization: (work to do all of these separately) >>> (work to represent all inputs)
  • Black box in the middle of >>>, THAT’s what Beancount does

Contributor agreement

Idea

  • Implement a feature that generates a closing + opening transaction to be able to split one’s file to multiple files. This should include the currency conversions as well.
  • Create a script that splits up a file at a particular date and output two files! You should be able to split by date interval, and by tag (to extract all the transactions with a particular tag to a separate file).
  • Provide a plugin that summarizes all the balances from transactions marked with a particular tag, and replaces those with a single transaction. If all the balances are zero, don’t output anything. This could be used to remove the opening + closing transactions when including multiple files!
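
To make the file-splitting idea concrete, here is a simplified sketch that partitions entries at a date and computes the balances to carry forward into an opening transaction. Entries are modeled as (date, postings) tuples with (account, number, currency) postings, which is an assumption for the example; positions held at cost and the currency conversions mentioned above are deliberately ignored.

```python
import datetime
from collections import defaultdict
from decimal import Decimal


def split_at_date(entries, split_date):
    """Split `entries` at `split_date` and compute the opening postings.

    Returns (before, after, opening) where `opening` lists the non-zero
    (account, number, currency) balances accumulated strictly before
    `split_date`. Negating them gives the matching closing postings.
    """
    before = [entry for entry in entries if entry[0] < split_date]
    after = [entry for entry in entries if entry[0] >= split_date]

    balances = defaultdict(Decimal)
    for _, postings in before:
        for account, number, currency in postings:
            balances[(account, currency)] += number

    opening = [
        (account, number, currency)
        for (account, currency), number in sorted(balances.items())
        if number != 0
    ]
    return before, after, opening


entries = [
    (datetime.date(2015, 12, 30), [("Assets:Cash", Decimal("100"), "USD"),
                                   ("Income:Salary", Decimal("-100"), "USD")]),
    (datetime.date(2015, 12, 31), [("Expenses:Food", Decimal("30"), "USD"),
                                   ("Assets:Cash", Decimal("-30"), "USD")]),
    (datetime.date(2016, 1, 2), [("Expenses:Food", Decimal("5"), "USD"),
                                 ("Assets:Cash", Decimal("-5"), "USD")]),
]
before, after, opening = split_at_date(entries, datetime.date(2016, 1, 1))
for posting in opening:
    print(posting)
```

The same balance accumulation, restricted to entries carrying a given tag, is the core of the summarizing plugin in the last item: if `opening` comes back empty, nothing needs to be output.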

Annuities

  • Annuities w/ numpy

Starred Email - TODO

Emacs Support

  • Continue patching beancount.el from Stefan’s submitted patches: https://mail.google.com/mail/u/0/#inbox/14fdc82a04d90565
  • Try out outline-minor-mode to see if I could make Beancount be a major mode instead of a minor mode. https://mail.google.com/mail/u/0/#inbox/149d363077404909

    From Stefan: “Hmm… so basically as a glorified outline-minor-mode. Could you try out outline-minor-mode and tell me what irks you most?”

  • Look into TAB problem: https://mail.google.com/mail/u/0/#inbox/14fdc82a04d90565

    “Well, actually I was waiting to resolve the TAB problem, and for that I need a “recipe,” since what I tried works.”

  • Make it possible to bean-doctor context on a specific link or tag (thing-at-point).
  • Fix amounts alignment function, it always seems to pick up stuff from the narration. Make it a fixed column!

Open Tickets - TODO

  • Issue #130: Padding involving the same account doesn’t work incrementally (blais/beancount)
  • Issue #131: Aggregate the state of DisplayContext across all parsed files.
  • Issue #128: bean-report holdings market value incorrect if no price given initially (blais/beancount)

TaxMin Reporting

  • Read about TaxMin loss harvesting and see if I couldn’t implement a similar thing in Beancount:

    “Thank you for taking the time to write in and follow up. At this time, there is no way to view the full cost basis of all of your holdings at once at Betterment.

    Betterment uses a unique algorithm to optimize the tax-efficiency when selling shares. You can learn more about our cost basis method for sales here: https://www.betterment.com/resources/investment-strategy/taxes/lowering-your-tax-bill-by-improving-our-cost-basis-accounting-methods/

    The same data available on our PDF statements is available for download in CSV format. At this time, there is no way to provide more digits of precision in that PDF report. If you do transfer funds out of Betterment into another brokerage service, your cost basis information will be provided in full to your new financial institution upon completion of your transfer.”

    “Thank you for writing in about tax loss harvesting. To avoid wash sales, Betterment uses a proprietary Parallel Position Management system that utilizes secondary securities as a safe harbor to minimize wash sales. Sales will also be delayed if recent deposits are made until they are clear of the 30 day wash sale window.

    You can find much more information in both our White Paper (see “Parallel Position Management”) and FAQ below:

    TLH White Paper: https://www.betterment.com/resources/tax-loss-harvesting-white-paper/ TLH FAQ: http://support.betterment.com/customer/portal/topics/670337-tax-loss-harvesting

    In general, you’ll want to avoid investing in any passive ETFs or mutual funds that track the same index as the funds in our portfolios. You can check the indexes of our investments here.

    TLH+ will automatically manage your purchases when they are made inside your Betterment taxable account, as well as your Betterment IRA.

    Target date funds, actively managed funds and individual stocks are generally not problematic to use alongside our TLH+ algorithm. To read more about using TLH+ with external accounts, please see our guide here.

    Thank you for the feedback regarding exporting data to CSV format, your friend is correct. My apologies for the earlier miscommunication; I’ll certainly pass your suggestion to our product team.

    Please let us know if you have any additional questions or if we can be of any further assistance. Thank you for being a Betterment customer, have a wonderful day.”

Adding Types

Comments from YYing re. adding types: https://bitbucket.org/blais/beancount/issues/140/add-type-annotation-to-beancount-codebase

  • Some bad design choices are more obvious when adding type annotations. E.g. https://github.com/yegle/beancount-type-stubs/blob/master/beancount/core/data.pyi#L67 the tags field of the Transaction type is an Optional[Set[str]]. I think it should be a Set[str] since an empty set should be sufficient to express “there’s no tags for this transaction”. I’m not sure if there’s any difference in terms of performance.
  • mypy has very limited ability to infer function return types. E.g. functions like is_balance_sheet_account in beancount/core/account_types.py, which returns foo in bar, mypy is unable to infer that the return type is a bool. I think pytype from Google has better type inference but it’s written in Python2 and I haven’t figured out how to run it against python3 code.
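
The two comments above can be illustrated with a small, self-contained sketch. Txn is a hypothetical stand-in for the real Transaction type, and this is_balance_sheet_account is a simplified version with a hard-coded list of roots, not the signature used in beancount/core/account_types.py.

```python
from typing import FrozenSet, NamedTuple


class Txn(NamedTuple):
    """Stand-in for a transaction whose tags are never None."""
    narration: str
    # An empty set expresses "no tags"; callers need no None check.
    tags: FrozenSet[str] = frozenset()


BALANCE_SHEET_ROOTS = ("Assets", "Liabilities", "Equity")


def is_balance_sheet_account(account: str) -> bool:
    """Return True if `account` belongs on the balance sheet.

    The explicit `-> bool` annotation spares the type checker from having to
    infer the type of the `in` expression.
    """
    return account.split(":", 1)[0] in BALANCE_SHEET_ROOTS


txn = Txn("Dinner")
print("trip" in txn.tags)  # Works without guarding against None.
print(is_balance_sheet_account("Assets:US:Bank:Checking"))
```

With Optional[Set[str]], the membership test on txn.tags would have to be preceded by a None check everywhere it appears, which is the design cost the first comment points at.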

V3-Specific

  • Build RE/flex with Boost or PCRE2 with JIT-optimized regexps; see how much of a difference it makes; right now it’s using the default, which is claimed to be faster.