Skip to content

Tilda Database functions: Sizing

Laurent Hasson edited this page Sep 17, 2019 · 1 revision

<-- Common Helper Database Functions

Source code in PostgreSQL.helpers-start.sql

Sizing Helpers

Every project of a certain size needs to do sizing work and data growth projections. Tilda provides a few helper functions to help do that. They use standard Postgres catalog capabilities.

getSchemaSizes/getTableSizes

There are 2 functions:

  • getSchemaSizes: returns the sizes of all schemas, sorted by totalSize desc, so the largest schemas come first.
  • getTableSizes: returns the sizes of all tables, sorted by totalSize desc, so the largest tables come first.

They both return the same type of information, sorted by totalSize descending:

  • totalSize: a numeric for the total size of the object evaluated.
  • totalSizePretty: a text for the pg_size_pretty() version of totalSize.
  • totalSizeTable: a numeric for the size of the table data.
  • totalSizeTablePretty: a text for the pg_size_pretty() version of totalSizeTable.
  • totalSizeIndices: a numeric for the size of the index data.
  • totalSizeIndicesPretty: a text for the pg_size_pretty() version of totalSizeIndices.
  • totalSizeOther: a numeric for the size of other data (e.g., TOAST).
  • totalSizeOtherPretty: a text for the pg_size_pretty() version of totalSizeOther.
  • runningTotalSize: a numeric for the running total size (sorted by totalSize descending).
  • runningTotalSizePretty: a text for the pg_size_pretty() version of runningTotalSize.
  • runningTotalSizeTables: a numeric for the running total table data size.
  • runningTotalSizeTablesPretty: a text for the pg_size_pretty() version of runningTotalSizeTables.
  • runningTotalSizeIndices: a numeric for the running total index data size.
  • runningTotalSizeIndicesPretty: a text for the pg_size_pretty() version of runningTotalSizeIndices.
select * from Tilda.getSchemaSizes();

With some sample results like:

schemaName totalSize totalSizePretty totalSizeTable totalSizeTablePretty totalSizeIndices totalSizeIndicesPretty totalSizeOther totalSizeOtherPretty runningTotalSize runningTotalSizePretty runningTotalSizeTables runningTotalSizeTablesPretty runningTotalSizeIndices runningTotalSizeIndicesPretty
cms 23161847808 22 GB 14843625472 14 GB 8315813888 7931 MB 2408448 2352 kB 23161847808 22 GB 14843625472 14 GB 8315813888 7931 MB
public 6436872192 6139 MB 5085339648 4850 MB 1350205440 1288 MB 1327104 1296 kB 29598720000 28 GB 19928965120 19 GB 9666019328 9218 MB
population 1848131584 1763 MB 1689321472 1611 MB 158195712 151 MB 614400 600 kB 34026201088 32 GB 23004209152 21 GB 11017256960 10 GB
codesmaster 523141120 499 MB 339836928 324 MB 182509568 174 MB 794624 776 kB 34549342208 32 GB 23344046080 22 GB 11199766528 10 GB
select * from Tilda.getTableSizes();

With some sample results like:

schemaName tableName totalSize totalSizePretty totalSizeTable totalSizeTablePretty totalSizeIndices totalSizeIndicesPretty totalSizeOther totalSizeOtherPretty runningTotalSize runningTotalSizePretty runningTotalSizeTables runningTotalSizeTablesPretty runningTotalSizeIndices runningTotalSizeIndicesPretty
cms claim 13871022080 13 GB 6170230784 5884 MB 7700783104 7344 MB 8192 8192 bytes 13871022080 13 GB 6170230784 5884 MB 7700783104 7344 MB
public toto_3 6346522624 6053 MB 4995129344 4764 MB 1350148096 1288 MB 1245184 1216 kB 20217544704 19 GB 11165360128 10 GB 9050931200 8632 MB
cms claim2 5659443200 5397 MB 5658025984 5396 MB 0 0 bytes 1417216 1384 kB 25876987904 24 GB 16823386112 16 GB 9050931200 8632 MB
cms claim_test 3208855552 3060 MB 2711945216 2586 MB 496214016 473 MB 696320 680 kB 29085843456 27 GB 19535331328 18 GB 9547145216 9105 MB

getSizing

Then there is a function that helps you get quarterly-based sizing information for a table assuming there is a "created" timestamp indicating when a row was initially created. This is very useful to predict future growth based on past accumulated data. The function getSizing takes the following parameters:

  • schema_name: a varchar for the schema name of the table to be sized.
  • table_name: a varchar for the table name to be sized.
  • created_col_name: a varchar for the name of the column in that tale denoting the date/time a record was created, or some other timestamp denoting the most relevant date for this record (e.e., the date of a doctor's visit vs when the record was physically created in your database).
  • start_date: a date provided as a starting point for the analysis, i.e., a table may have data since 2010-01-01 but you only want to use data from 2018-01-01 for analysis.

🎈 NOTE: schema_name and table_name can be null to do work on all tables in a database, or all tables in a given schema. However, in that case, all tables must have the same "created_col_name" column, which may not be the case. Tables without that column will be exclued from sizing.

🎈 NOTE: Because this function does a count(*) grouped by the date_trunc('quarter', created_col_name), it may take a while to run on very large tables, and certainly if schema_name or table_name are null, large databases. Experiment before launching the function with null parameters. It is recommended to run getTableSizes() or getSchemaSizes() first to locate your largest objects. It's often the case that a few tables in a system represent the majority of the data, or are representative of data growth overall (e.g., a patient table in a healthare data warehouse, or a customer table in an eCommerce application) so you can do some projections.

The function returns the following values:

  • schemaName: a varchar for the schema name of the table sized.
  • tableName: a varchar for the table sized.
  • q: a date representing the truncated date for a quarter (e.g., 2019-01-01 for Q1 2910 or 2019-04-01 for Q2).
  • totalCount: a bigint for the total count of records in that table.
  • qCount: a bigint for the total count of records for that quarter.
  • totalSize: a bigint for the total size of that table (including indices and other data payloads such as TOAST).
  • totalSizepretty: a text for the pg_size_pretty() version of totalSize.
  • qSize: a numeric for the total size for that quarter.
  • qSizepretty: a text for the pg_size_pretty() version of qSize.
  • qPercent: a numeric for the percentage of that quarter Vs. total.
select *
  from Tilda.getSizing('cms','claim','start', '2007-01-01') 
 order by "schemaName", "tableName", "q"

with the following sample results:

schemaName tableName q totalCount qCount totalSize totalSizepretty qSize qSizepretty qPercent
cms claim 2008-01-01 17653422 1132704 13871022080 13 GB 890012270.37 849 MB 6.42
cms claim 2008-04-01 17653422 1693812 13871022080 13 GB 1330897978.38 1269 MB 9.59
cms claim 2008-07-01 17653422 1784685 13871022080 13 GB 1402300644.08 1337 MB 10.11
cms claim 2008-10-01 17653422 1810449 13871022080 13 GB 1422544481.95 1357 MB 10.26
cms claim 2009-01-01 17653422 1768470 13871022080 13 GB 1389559849.52 1325 MB 10.02
cms claim 2009-04-01 17653422 1776337 13871022080 13 GB 1395741276.03 1331 MB 10.06
cms claim 2009-07-01 17653422 1757398 13871022080 13 GB 1380860122.27 1317 MB 9.95
cms claim 2009-10-01 17653422 1690954 13871022080 13 GB 1328652329.86 1267 MB 9.58
cms claim 2010-01-01 17653422 1517785 13871022080 13 GB 1192586301.27 1137 MB 8.60
cms claim 2010-04-01 17653422 1271057 13871022080 13 GB 998721931.19 952 MB 7.20
cms claim 2010-07-01 17653422 914575 13871022080 13 GB 718619314.65 685 MB 5.18
cms claim 2010-10-01 17653422 535196 13871022080 13 GB 420525580.43 401 MB 3.03
Clone this wiki locally