Articles
These articles either focus on data.table (bold) or mention/use it (perhaps only briefly and you may need to search the article for "data.table"), ordered by date. If you know of an article that may be of interest to others, please add it here (**). You can also search all articles from the R blogosphere since c. 2009 on http://www.r-bloggers.com/. There is no filter applied: if the article exists and mentions data.table, positively or negatively, it is included on this page.
Please watch out for benchmarks measured in milliseconds. Comparisons on such small scales often do not hold when scaled up to larger data because, for example, they over-represent call overhead and/or the dataset is so small it fits in CPU cache. A test repetition count (e.g. ntimes=) of 5 or more is often an indication that the test data size is too small. Please check that setkey() has been used and its time reported separately.
Tutorials, slides and videos are over on the Videos & Slides page.
(**) all pages on this wiki have no write restrictions. You are encouraged to change content in this wiki yourself as you see fit. Changes will go live immediately with no oversight by any project member. If you spot any abuse, please check the edit history to see who made the edit and please inform us.
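As a rough illustration of the benchmarking advice above, the following sketch times setkey() separately from the keyed query that depends on it, using system.time() on a single run instead of many repetitions of a tiny example. The table size, column names, and the looked-up id are hypothetical choices made for this example only; in a real benchmark, scale the data until the timings are measured in seconds, not milliseconds.

```r
library(data.table)

set.seed(1)
n <- 1e6L  # hypothetical size; increase until timings are in seconds
DT <- data.table(id = sample(1e5L, n, replace = TRUE), x = runif(n))

# Time the one-off sort on its own so its cost is reported separately.
t_setkey <- system.time(setkey(DT, id))

# Time the keyed subset (binary search on the key column) on its own.
t_query <- system.time(res <- DT[.(42L), sum(x)])

cat("setkey elapsed (s):", t_setkey[["elapsed"]], "\n")
cat("query elapsed (s): ", t_query[["elapsed"]], "\n")
```

Reporting the two numbers side by side lets a reader judge whether the cost of sorting is repaid by the queries that follow it.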
Link | Title | Author |
---|---|---|
2024.11 | Syntax conversion: data.table vs. base vs. dplyr | Vincent Arel-Bundock |
2024.11 | Data wrangling with data.table | Stata2R: Kyle Butts, Nick Huntington-Klein, and Grant McDermott |
2024.11 | Julia DataFrames.jl comparison with data.table | authors of DataFrames.jl docs |
2024.11 | data.table.threads | Anirban Chetia |
2024.10 | Comparing data.table reshape to duckdb and polars | Toby Dylan Hocking |
2024.10 | Benchmarking rolling window functions in R | Mikkel Roald-Arbøl |
2024.09 | Mutation testing for data.table | Anirban Chetia |
2024.08 | Collapse reshape benchmark | Toby Dylan Hocking |
2024.07 | Benchmarking a change in data.table | Toby Dylan Hocking |
2024.06 | data.table for the Google Summer of Code 2024 (Joshua Wu) | Joshua Wu |
2024.02 | Column assignment and reference semantics in data.table | Toby Dylan Hocking |
2024.02 | NSF project activities | Anirban Chetia |
2024.02 | new programming with data.table | John MacKintosh |
2024.02 | more .I in data.table | John MacKintosh |
2024.01 | .I in data.table | John MacKintosh |
2024.01 | Reshape performance comparison | Toby Dylan Hocking |
2023.12 | Comparing data table to frame for row subset | Toby Dylan Hocking |
2023.12 | non-equi joins in data.table | John MacKintosh |
2023.11 | Some pedagogical elements of computer programming for data science: A comparison of three approaches to teaching the R language | David Shilane, Nicole Di Crecchio, Nicole L. Lorenzetti |
2023.11 | data.table CRAN diffs: Verifying consistency between CRAN and github | Toby Dylan Hocking |
2023.10 | data.table asymptotic timings | Toby Dylan Hocking |
2023.03 | A Coding Translation to Increase the Efficiency of Programmatic Data Analyses | David Shilane |
2023.02 | Pivoting data in R with tidyr and data.table | John MacKintosh |
2022.11 | dplyr 1.1.0 is coming soon | Davis Vaughan |
2022.11 | Handling larger than memory data with {arrow} and {duckdb} | David Lucey |
2022.11 | R Package Release History: Extracting and plotting data from CRAN web site | Toby Dylan Hocking |
2022.10 | Efficiency comparison of dplyr and tidyr functions vs base R | Manuel Teodoro Tenango |
2022.08 | modifying columns in datatable with lapply | John MacKintosh |
2022.08 | Simulating data from a non-linear function by specifying a handful of points | Keith Goldfeld |
2022.06 | Timing data.table Operations | Thomas Shafer |
2022.06 | Shuffling Columns With data.table | Thomas Shafer |
2022.06 | A quirk when using data.table? | Kenneth Tay |
2022.05 | Comparing performances of CSV to RDS, Parquet, and Feather file formats in R | Tomaž Kaštrun |
2022.04 | Loading a large, messy csv using data.table fread with cli tools | David Lucey |
2022.04 | Greatly revised edition of tidyverse skeptic. Original 2019.07 below: Ctrl-F "matloff" | Norm Matloff |
2022.03 | Shiny: Fast Data Loading with fst | Philipp Probst |
2021.12 | Optimising dplyr | Tom Jemmett |
2021.11 | Should I Move to a Database? | Roel M. Hogervorst |
2021.10 | Most Starred and Forked GitHub Repos for Data Science and R | Kenneth Leung |
2021.10 | fwf without the faff | John MacKintosh |
2021.10 | Simulating the Squid Game bridge scene in R | John Paul Helveston |
2021.09 | Calculating hotel occupancy with R | John MacKintosh |
2021.08 | Exploring Stock Market Listing Mortality since 1986 | David Lucey |
2021.08 | Introducing the fastverse: An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation | Sebastian Krantz |
2021.08 | Well Well Well My Excel | John MacKintosh |
2021.08 | Cutting down code in dplyr and data.table | John MacKintosh |
2021.08 | Code performance in R: Working with large datasets | Mira Céline Klein |
2021.07 | Time Travel with py datatable 1.0 | Gregory Kanevsky |
2021.06 | DTPlyr – easier data.table for DPLYR users | Gary Hutson |
2021.06 | Stress testing reshape operations on list columns | Toby Dylan Hocking |
2021.06 | Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package | Toby Dylan Hocking |
2021.05 | Update about data reshaping and visualization in R and python | Toby Dylan Hocking |
2021.05 | Hamburg RUG: A professional trading research system in R | Daniel Brandt |
2021.05 | The new R pipe | Elio Campitelli |
2021.04 | 10 Tips And Tricks For Data Scientists Vol.6 | George Pipis |
2021.04 | Not data.table vs dplyr... data.table + dplyr! | Matt Dancho |
2021.03 | Some data.table tips | John MacKintosh |
2021.03 | Data.Table – everything you need to know to get you started in R | Gary Hutson |
2021.02 | I wrote one of the fastest DataFrame libraries (hacker news) | Ritchie Vink |
2021.02 | Joins vs case whens - speed and memory tradeoffs | Thomas Mock |
2021.02 | The unequalled joy of non-equi joins | David Selby |
2021.02 | Measuring and Monitoring Arrow's Performance: Some Updated R Benchmarks (response) | Jonathan Keane & Neal Richardson |
2021.02 | Bigger Data With Ease Using Apache Arrow, (response) (rebuttal) | Neal Richardson |
2021.01 | Fast and Easy Aggregation of Multi-Type and Survey Data in R | Sebastian Krantz |
2021.01 | How to create a stock screener | Martin Bel |
2020.12 | You only need library(data.table) / 你只需要library(data.table) (in Chinese) | Xianying Tan (@shrektan) |
2020.11 | Comparing Common Operations in dplyr and data.table | Martin Chan |
2020.11 | non-equi merge in data.table and epidemiology | Denis Mongin |
2020.10 | The ultimate R data.table cheat sheet | Sharon Machlis |
2020.10 | What is R data.table and Why is R data.table? (In Korean, 한국어) | HongDon Lee |
2020.10 | Solving small problems with data.table | John MacKintosh |
2020.10 | Python and R – Part 1: Exploring Data with Datatable | David Lucey |
2020.10 | Decomposition and Smoothing with data.table, reticulate, and spatstat | Tony ElHabr |
2020.09 | The Fastest Way To Read And Write Files In R | George Pipis |
2020.09 | The treedata.table Package | April Wright, Cristian Román-Palacios, Josef Uyeda |
2020.09 | Gotta go fast with "{tidytable}" | Bruno Rodrigues |
2020.09 | Task 2 - Retail Strategy and Analytics | Shrishti Vaish |
2020.08 | Solving small data problems with data.table | John MacKintosh |
2020.08 | Replicating .SD in Python Datatable | Samuel Oranyeli |
2020.08 | Let's Learn data.table (日本語) | Uryu Shinya |
2020.08 | 87th TokyoR Meetup Roundup: {data.table}, Bioconductor, & more! | Ryo Nakagawara |
2020.07 | 5 handy options in R data.table’s fread | Sharon Machlis |
2020.07 | Even more reshape benchmarks | Grant McDermott |
2020.07 | RvsPython #2: Pivoting Data From Long to Wide Form | Benjamin Smith |
2020.06 | A gentle introduction to data.table | @atrebas |
2020.06 | Reshape benchmarks | Grant McDermott |
2020.06 | Selecting and Grouping Data with Python Datatable | Samuel Oranyeli |
2020.05 | dtplyr speed benchmarks | Iyar Lin |
2020.05 | Creating a data.table from C++ | David Zimmermann, Leonardo Silvestri, Dirk Eddelbuettel |
2020.04 | Data manipulation libraries: Translating between data.table, pandas, dplyr | Toby Dylan Hocking |
2020.04 | patientcounter | John MacKintosh |
2020.04 | Fastest data operations with least memory in tidy syntax | Tian-Yuan Huang |
2020.04 | W is for Write and Read Data – Fast | Sara Locatelli |
2020.03 | Use data.table the tidy way: An ultimate tutorial of tidyfst | Tian-Yuan Huang |
2020.03 | R data.table symbols and operators you should know | Sharon Machlis |
2020.03 | Variable name in functions, it's easy with datatable | Lino Galiana |
2020.02 | stringsAsFactors | Kurt Hornik |
2020.01 | Programming with data.table | John MacKintosh |
2020.01 | Blazing Fast Data Wrangling With R data.table | Thu Vu |
2020.01 | New Timings for a Grouped In-Place Aggregation Task | John Mount |
2020.01 | Base R, the tidyverse, and data.table: a comparison of R dialects to wrangle your data | Jason Mercer |
2019.12 | 4 great free tools that can make your R work more efficient, reproducible and robust | Jozef Hajnala |
2019.12 | Why I don’t use the Tidyverse | Holger K. von Jouanne-Diedrich |
2019.11 | dtplyr 1.0.0 | Hadley Wickham |
2019.10 | Using ggplot2 Inside data.table | John Lashlee |
2019.10 | Fast and Readable 'If Else' in R | Tyson Barrett |
2019.10 | Data Joins: Speed and Efficiency of dplyr and data.table | Tyson Barrett |
2019.10 | Comparing Efficiency and Speed of data.table: Adding variables, filtering rows, and summarizing by group | Tyson Barrett |
2019.10 | Columnar File Performance Check-in for Python and R: Parquet, Feather, and FST | Wes McKinney |
2019.09 | Selecting the max value from each group, a case study: data.table | Nathan Eastwood |
2019.09 | Sentiment analysis at the Fringe, part 1 | Megan Stodel |
2019.09 | {disk.frame} is epic | Bruno Rodrigues |
2019.08 | A shallow benchmark of R data frame export/import methods | Julien Barnier |
2019.08 | The R Factor | Owen Jones |
2019.08 | Hydra Chronicles, Part V: Loose Ends | Brodie Gaslam |
2019.08 | Everyone’s Favorite Blogpost: CSV Benchmarks | Jacob Quinn |
2019.08 | No visible binding for global variable | Nathan Eastwood |
2019.08 | Why Machine Learning is more Practical than Econometrics in the Real World | Adrian Antico |
2019.08 | What’s next for the popular programming language R? | Dan Kopf |
2019.08 | Wrangling 4.6M Rows with dtplyr (the NEW data.table backend for dplyr) | Matt Dancho |
2019.08 | mlr3-0.1.0 | Patrick Schratz |
2019.07 | Hydra Chronicles, Part IV: Reformulation of Statistics | Brodie Gaslam |
2019.07 | Multiple Columns to Multiple Colums at Once | Recle Etino Vibal |
2019.07 | Long to Wide and Wide to Long Format Conversion | Giovanni Pavolini |
2019.07 | fread-benchmarks-rsuite | Alfonso R. Reyes |
2019.07 | Bayesian Power Analysis with data.table, tidyverse, and brms | Tyson Barrett |
2019.07 | Making .SD your best friend | José Morales |
2019.07 | data.table's cube function | Giovanni Pavolini |
2019.07 | How to use .SD in the data.table package | Sharon Machlis |
2019.07 | Why I Chose to Learn data.table (and such related things) | Tyson Barrett |
2019.07 | What R’s most popular tools say about the state of data science | Dan Kopf |
2019.07 | data.table and Text Analysis: Analyzing the Four Gospels | Tyson Barrett |
2019.07 | Analyzing data with data.table | Giovanni Pavolini |
2019.07 | Why I love data.table | Elio Campitelli |
2019.07 | Why I like the Tidyverse | Chris Muir |
2019.07 | An opinionated view of the Tidyverse "dialect" of the R language, and its promotion by RStudio. Circa this revision on GitHub was in effect at the time and widely shared; e.g. HackerNews. Revision announced 2022.04. | Norm Matloff |
2019.06 | Learning Japanese with data.table and ggplot2 | Atrebas |
2019.06 | data.table by a dummy | John MacKintosh |
2019.06 | My Favorite data.table Feature | John Mount |
2019.06 | Coke vs. Pepsi? data.table vs. tidy? (Part 2) | Beth Milhollin, Russell Zaretzki, and Audris Mockus |
2019.06 | The Psychology of Flame Wars | Edwin Thoen |
2019.06 | data.table is Much Better Than You Have Been Told | John Mount |
2019.06 | data.table is expressive and powerful | Michael Frasco |
2019.06 | How data.table's fread can save you a lot of time and memory, and take input from shell commands | Jozef Hajnala |
2019.06 | Hydra Chronicles, part III: Catastrophic Imprecision | Brodie Gaslam |
2019.06 | Hydra Chronicles, part II: beating data.table at its own game | Brodie Gaslam |
2019.06 | An Overview of Python's Datatable package | Parul Pandey |
2019.06 | For and Against data.table | Aaron Jacobs |
2019.05 | Three reasons why I use data.table | Megan Stodel |
2019.05 | Timing Working With a Row or a Column from a data.frame | John Mount |
2019.05 | Using Data Cubes with R | Kristian Larsen |
2019.05 | cranlogs 2.1.1 is on CRAN! | R-hub blog |
2019.05 | R package installation on windows considered harmful | Toby Dylan Hocking |
2019.05 | Hydra Chronicles, part I: Pixie Dust | Brodie Gaslam |
2019.04 | Using data.table with magrittr pipes: best of both worlds | Martin Chan |
2019.04 | What are the Popular R Packages? | John Mount |
2019.04 | Coke vs. Pepsi? data.table vs. tidy? Examining Consumption Preferences for Data Scientists | Audris Mockus |
2019.03 | A data.table and dplyr tour | Atrebas |
2019.03 | Dependencies. Now with badges! | Dirk Eddelbuettel |
2019.03 | Unit Tests in R | John Mount |
2019.03 | Creating blazing fast pivot tables from R with data.table - now with subtotals using grouping sets | Jozef Hajnala |
2019.02 | A strategy for faster group statistics | Brodie Gaslam |
2019.02 | Verbose data.table and uncovering hidden cedta's data table awareness decisions | Jozef Hajnala |
2018.12 | Timing Grouped Mean Calculation in R | John Mount |
2018.12 | How to sort data by one or more columns with base R, dplyr and data.table | Jozef Hajnala |
2018.12 | Smartly select and mutate data frame columns, using dict | Roman Pahl |
2018.11 | Statistics Sunday: Reading and Creating a Data Frame with Multiple Text Files | Sara Locatelli |
2018.11 | Wrangling and Manipulation of Monthly Philippine Consumer Price Index | Recle Vibal |
2018.10 | Now "fread" from data.table can read "gz" and "bz2" files directly | Pradeep Mavuluri |
2018.10 | How to perform merges (joins) on two or more data frames with base R, tidyverse and data.table | Jozef Hajnala |
2018.10 | How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest? | Jozef Hajnala |
2018.10 | Some R Guides: tidyverse and data.table Versions | John Mount |
2018.10 | Running the Same Task in Python and R | John Mount |
2018.10 | Limiting dependencies in R package development | Scott Chamberlain |
2018.09 | R Tip: Give data.table a try | John Mount |
2018.08 | Timings of a Grouped Rank Filter Task | John Mount |
2018.08 | R Tip: Consider Radix Sort | John Mount |
2018.08 | Meta-packages, nails in CRAN’s coffin | John Mount |
2018.07 | EARL London interviews – Patrik Punco, NOZ Medien | Mango Solutions |
2018.07 | Speed up your R Work | John Mount |
2018.06 | Python for data analysis… is it really simple?!? | Ferenc Bodon |
2018.06 | R and Data – When Should we Use Relational Databases? | Claude Seidman |
2018.06 | Re-referencing factor levels to estimate standard errors when there is interaction turns out to be a really simple solution | Keith Goldfeld |
2018.06 | Most Starred R Packages on GitHub | Steven Mortimer |
2018.06 | Melt and Cast The Shape of Your Data-Frame: Exercises | sindri |
2018.06 | Sharpening The Knives in The data.table Toolbox: [Exercises] [Solutions] | sindri |
2018.06 | rqdatatable: rquery Powered by data.table | John Mount |
2018.04 | An R vlookup? Not so silly idea | Hanjo Oden |
2018.04 | Benchmarking the six most used manipulations for data.tables in R | Opremic |
2018.04 | Down the AUC Rabbit Hole and into Open Source: Part 2 | Michael Frasco |
2018.04 | Down the AUC Rabbit Hole and into Open Source: Part 1 | Michael Frasco |
2018.04 | Quick R Tutorial | Frank Erickson |
2018.03 | pandas vs. data.table – A study of data-frames – Part 2 | Tobias Krabel |
2018.02 | Retail analytics: from hours to seconds using R | Bharani Subramaniam |
2018.02 | pandas vs. data.table – A study of data-frames | Christian Moreau |
2018.02 | Julia vs R vs Python: string-sort performance + an unfinished journey to optimizing Julia's performance | ZJ |
2018.02 | dplyr, (mc)lapply, for-loop and speed | Mike Spencer |
2018.02 | Speeding up spatial analyses by integrating sf and data.table: a test case | Lorenzo Busetto |
2018.02 | Packages for Getting Started with Time Series Analysis in R | Abraham Mathew |
2018.02 | DataExplorer: Fast Data Exploration With Minimum Code | Boxuan Cui |
2018.01 | Supercharge your R code with wrapr | John Mount |
2018.01 | Tidyverse and data.table, sitting side by side… and then base R walks in | Iñaki Úcar |
2018.01 | Tidyverse and data.table, sitting side by side (Part 1) | Dirk Eddelbuettel |
2018.01 | Base R can be Fast | John Mount |
2018.01 | Lightning fast serialization of datasets using the fst package | Mark Klik |
2018.01 | rquery: Fast Data Manipulation in R | John Mount |
2017.12 | A tour of the data.table package by creator Matt Dowle | David Smith |
2017.12 | More Pipes in R | John Mount |
2017.12 | Team Rtus wins Munich Re Datathon with mlr | Jann Goschenhofer |
2017.12 | Correlated log-normal chain-ladder model | Markus Gesmann |
2017.11 | How we built a Shiny App for 700 users | Olga Mierzwa-Sulima |
2017.11 | An empirical study of group-by strategies in Julia | ZJ |
2017.11 | Using data.table and Rcpp to scale geo-spatial analysis with sf | Tim Appelhans |
2017.11 | Creating integer64 and nanotime vectors in C++ | Dirk Eddelbuettel |
2017.10 | The Impressive Growth of R | David Robinson |
2017.10 | Data.Table by Example – Part 3 | atmathew |
2017.09 | Speed of data manipulations in Julia vs R | ZJ |
2017.09 | Data.Table by Example – Part 2 | atmathew |
2017.09 | Data.Table by Example – Part 1 | atmathew |
2017.09 | Beyond the basics of data.table: Smooth data exploration | Sindri |
2017.09 | Strategies to Speed-up R Code | Selva Prabhakaran |
2017.08 | Is the Hadleyverse the only option? | Billy Fung |
2017.08 | Basics of data.table: Smooth data exploration | Sindri |
2017.08 | Polygenic Risks Scores with data.table in R | Sahir Rai Bhatnagar |
2017.08 | July(ish) Update | John MacKintosh |
2017.08 | R for System Adminstration | Dirk Eddelbuettel |
2017.07 | Compare data.table pipes and magrittr pipes | Guanglai Li |
2017.06 | data.table tutorial (with 50 examples) | Deepanshu Bhalla |
2017.06 | The data.table R Package Cheat Sheet | Karlijn Willems |
2017.06 | Data Manipulation with data.table (part 2) | Biswarup Ghosh |
2017.06 | R in pRoduction: theRe be dRagons! | Tim Sweetser and Kyle Schmaus |
2017.06 | Improving Zillow’s Zestimate with 36 Lines of Code | Eduardo Ariño de la Rubia |
2017.06 | Data Manipulation with data.table (part 1) | Biswarup Ghosh |
2017.05 | plotly 4.7.0 now on CRAN | Carson Sievert |
2017.05 | R⁶ — Idiomatic (for the People) | Bob Rudis |
2017.05 | Reading/writing biggish data, revisited | Karl Broman |
2017.05 | dplyr in context | John Mount |
2017.05 | Everyone knows that loops in R are to be avoided but vectorization is not always possible | Keith Goldfeld |
2017.04 | R code to accompany Real-World Machine Learning (Chapter 6): Exploring NYC Taxi Data | Paul Adamson |
2017.04 | Fast data loading from files to R | Olga Mierzwa-Sulima |
2017.03 | Data Manipulation with Python Pandas and R Data.Table | Fisseha Berhane |
2017.03 | Fast data lookups in R: dplyr vs data.table | Marek Rogala |
2017.02 | Fitting logistic regression on 100gb dataset on a laptop | Dmitriy Selivanov |
2017.02 | Large data, feature hashing and online learning | Dmitriy Selivanov |
2017.02 | Moving largish data from R to H2O - spam detection with Enron emails | Peter Ellis |
2017.01 | Discover your data (XGBoost vignette) | Tianqi Chen, Tong He, Michaël Benesty, Yuan Tang |
2017.01 | fst: Fast serialization of R data frames | David Smith |
2017.01 | fst: Lightning Fast Serialization of Data Frames | Mark Klik |
2017.01 | R to the Rescue | John MacKintosh |
2016.12 | Using R to prevent food poisoning in Chicago | David Smith |
2016.12 | Behind the scenes of CRAN | Matt Dowle |
2016.12 | nanotime 0.0.1: New package for Nanosecond Resolution Time for R | Dirk Eddelbuettel |
2016.12 | Does replyr::let work with data.table? | John Mount |
2016.12 | data.table: Where Have You Been All My Life? | JoAnn Rudd Alvarez |
2016.12 | Organize your data manipulation in terms of “grouped ordered apply” | John Mount |
2016.12 | Comparing a MySQL Query with a Data Table in R | Douglas Rice |
2016.11 | data.table: squeeze the maximum speed when using data in R | Stanislav Chistyakov |
2016.10 | Data Wrangling: Quick Guide for dplyr, data.table and R build-in data.frame | Vincent Cao |
2016.09 | This Machine Learning Project on Imbalanced Data Can Add Value to Your Resume | Manish Saraswat |
2016.09 | Rolling a join | Will Rogers |
2016.07 | Winning approach of the Facebook V Kaggle competition | Tom Van de Wiele |
2016.07 | New release of partools package | Norm Matloff |
2016.07 | Bad Coder, Bad Coder! | Norm Matloff |
2016.06 | Intro to the data.table package | Steve Pittard |
2016.06 | Boost Your Data Munging with R | Jan Gorecki |
2016.06 | Improving Season on Season | James P. Curley |
2016.06 | Understanding data.table Rolling Joins | Robert Norberg |
2016.05 | From a (set.)seed grows a mighty dataset | Jonathan Carroll |
2016.05 | Feather: fast, interoperable data import/export for R | David Smith |
2016.05 | Best packages for data manipulation in R | Fisseha Berhane |
2016.05 | My Two favorite Packages for Data Manipulation in R | Fisseha Berhane |
2016.05 | Use H2O and data.table to build models on large data sets in R | Manish Saraswat |
2016.05 | The R Data I/O Shootout | Eduardo Ariño de la Rubia |
2016.05 | Red herring bites | Matt Dowle |
2016.05 | data.table() vs data.frame() – Learn to work on large data sets in R | Manish Saraswat |
2016.04 | Feather: it's about metadata | Wes McKinney |
2016.04 | Fast csv writing for R | Matt Dowle |
2016.04 | I'll Keep Using R | Michael Ekstrand |
2016.04 | data.table objects should not be considered data.frame instances in R [retracted] | John Mount |
2016.04 | Learning R in Seven Simple Steps | Martijn Theuwissen |
2016.04 | Collapsing lists of data.frames with data.table | Steph Locke |
2016.04 | Working with databases in R | Fisseha Berhane |
2016.03 | Data table exercises: keys and subsetting | Han de Vries |
2016.03 | Performing SQL selects on R data frames | Fisseha Berhane |
2016.02 | Read from hdfs with R. Brief overview of SparkR | Dmitriy Selivanov |
2016.02 | Up to code? An algorithm is helping Chicago health officials predict restaurant safety violations (featured on TV at 06:40). [Tweet] [Code] | PBS NewsHour |
2016.01 | Strategies to Speedup R Code | Selva Prabhakaran |
2015.12 | Our R package roundup 2015 | Christoph Safferling |
2015.12 | Who’s downloading the forecast package? | Rob J Hyndman |
2015.12 | Solve common R problems efficiently with data.table | Jan Gorecki |
2015.11 | Efficient aggregation (and more) using data.table | David Kun |
2015.11 | Scaling data.table with index | Jan Gorecki |
2015.11 | H2O World 2015 – Day 2 Highlights | Anmol Rajpurohit, KDnuggets |
2015.11 | H2O World 2015 | Joseph Rickert |
2015.11 | H2O.ai raises $20m series B to capitalize on rapid open source machine-learning growth | Matt Aslett, 451 Research |
2015.10 | R and Impala: it's better to KISS than using Java | Gergely Daroczi |
2015.10 | R: data.table – Finding the maximum row | Mark Needham |
2015.09 | Querying a 20 million line CSV file – data.table vs data frame | Mark Needham |
2015.09 | Data ergonomics with data.table, iHub Nairobi, with supporting materials | Henk Harmsen |
2015.09 | R Stories from the Trenches [Video] [Slides] | Szilard Pafka |
2015.09 | Advanced Tips and Tricks with data.table | Andrew Brooks |
2015.08 | data.table cookbook | Steph Locke |
2015.07 | Overlap joins in R: a speed comparison with packages sqldf and data.table | Zev Ross |
2015.06 | Data Warehousing with R | Jan Gorecki |
2015.06 | Auditing data transformation | Jan Gorecki |
2015.06 | Back from R/Finance in Chicago | Markus Gesmann |
2015.05 | Fast data munging in R | Alexander Konduforov |
2015.05 | No THIS Is How You dplyr and data.table! | Jeffrey Horner |
2015.05 | Comparing data frames, data.table and dplyr with random walks | David Smith |
2015.05 | Working with "large" datasets, with dplyr and data.table | Arthur Charpentier |
2015.04 | Comparing the execution time between foverlaps and findOverlaps [data.table vs GenomicRanges] | Katarzyna Wręczycka |
2015.04 | Open Source Business Intelligence: Then and Now | Steve Miller |
2015.04 | Mapping Flows in R with data.table and lattice | Oscar Perpiñán Lamigueiro |
2015.03 | Need for Processing Speed: data.table | OpenAnalytics |
2015.03 | Getting Data From An Online Source | Robert Norberg |
2015.02 | A data.table R tutorial by DataCamp: intro to DT[i, j, by] | DataCamp |
2015.02 | Minimal example for joining data.tables | Markus Gesmann |
2015.01 | Using the microbenchmark package to compare the execution time of R expressions | Stephen Turner |
2015.01 | Sessionizing Log Data Using data.table | Randy Zwitch |
2015.01 | R in Business Intelligence | Jan Gorecki |
2014.12 | dplyr and a very basic benchmark | Szilard Pafka |
2014.12 | JOINing data in R using data.table | Ronald Stalder |
2014.12 | Cheat Sheets for Data Science | Steve Miller |
2014.11 | Partying R Style with Sqor Sports, R on Azure, and data.table | Joseph Rickert |
2014.11 | The data.table Cheat Sheet | DataCamp |
2014.11 | Some R Highlights from H20 World | Joseph Rickert |
2014.10 | Complete data.table tutorial: data analysis the data.table way | DataCamp |
2014.10 | data.table University | Steve Miller |
2014.10 | Visualising the seasonality of Atlantic windstorms | Markus Gesmann |
2014.08 | Scaling up data frames | Ben Lorica |
2014.08 | data.table for R | Grant Rettke |
2014.08 | MongoDB – State of the R | Raffael Vogler |
2014.08 | VIDEO Matt Dowle's data.table talk from useR! 2014 | Eduardo Ariño de la Rubia |
2014.08 | Pro Grammar and Devel Hoper | Romain Francois |
2014.08 | Faster CSV Import with R | Phill Clarke |
2014.07 | 10 R Packages to Win Kaggle Competitions | Xavier Conort |
2014.07 | R – Data.Table Rolling Joins | Ben Gorman |
2014.07 | Dependencies of popular R packages | Andrie de Vries |
2014.07 | 2014 useR! conference, days 1-2 | Karl Broman |
2014.06 | The joy of joining data.tables | Markus Gesmann |
2014.06 | Concatenating a list of data frames | Andrew |
2014.05 | R/Finance 2014 | Steve Miller |
2014.05 | Working with large data sets in R - data.table and dcast | Kamil Bartocha |
2014.05 | Reading large data tables in R | Fabio Marroni |
2014.04 | Exploring US healthcare data | Vik Paruchuri |
2014.04 | data.table vs dplyr in split apply combine style analysis | Brodie G |
2014.02 | Dueling R and Python Followup | Steve Miller |
2014.02 | Efficiency of Importing Large CSV Files in R | statcompute |
2014.01 | Benchmark on baseball data: dplyr (0.1) and data.table (1.8.10) [tweet] | Arun Srinivasan and Matt Dowle |
2014.01 | R: the good parts | Jose Quesada |
2014.01 | Two of my favorite data.table features | Brandon Le Beau |
2014.01 | When I use plyr/dplyr/data.table | Educate-R |
2013.12 | Review: Kölner R Meeting 13 December 2013 | Markus Gesmann |
2013.09 | A speed comparison of plyr, data.table and dplyr | Jake Russ |
2013.08 | An R function like “order” from Stata | Ananda Mahto |
2013.07 | Fig Data: 11 Tips on How to Handle Big Data in R (and 1 Bad Pun) | Ulrich Atz |
2013.07 | A Bottom-up Start on Big Data Analytics | Steve Miller |
2013.06 | Simulating Map-Reduce in R for Big Data Analysis Using Flights Data | Jitender Aswani |
2013.06 | Improve The Efficiency in Joining Data with Index | statcompute |
2013.04 | FasteR! HigheR! StrongeR! – A Guide to Speeding Up R Code for Busy People | Noam Ross |
2013.04 | Using data.table for binning | Oscar Perpiñán Lamigueiro |
2013.03 | RMark: data.table merge vs core merge | Xachriel |
2013.02 | data.table or data.frame? | DataParadigms |
2013.01 | Another Benchmark for Joining Two Data Frames | statcompute |
2013.01 | Efficiecy of Extracting Rows from A Data Frame in R | statcompute |
2013.01 | Efficiency in Joining Two Data Frames | statcompute |
2012.12 | Surprising Performance of data.table in Data Aggregation | Wensui Liu |
2012.11 | Data.table rocks! Data manipulation the fast way in R | Markus Gesmann |
2012.10 | Generate a panel data.table or data.frame to fill with data | Thiemo Fetzer |
2012.06 | Transforming subsets of data in R with by, ddply and data.table | Markus Gesmann |
2012.06 | Access data quickly and easily: data.table package | Anna Longari |
2012.05 | data.table 1.8.1 - Now allows numeric columns and big-number (via bit64) in keys! | Branson Owen |
2012.03 | R code for Chapter 2 of Non-Life Insurance Pricing with GLM | Allan Engelhardt |
2012.02 | Elegant & fast data manipulation with data.table | Carl Boettiger |
2012.01 | Say it in R with "by", "apply" and friends | Markus Gesmann |
2011.08 | Comparison of ave, ddply and data.table | Paul Hiemstra |
2011.04 | Data Aggregation in R: plyr, sqldf and data.table | Hayward Godwin |
2011.03 | Applying functions on groups: sqldf, plyr, doBy, aggregate or data.table ? | altuna |
2011.03 | Fast(ish) extraction of exon locations from a BED12 file using data.table | altuna |
2011.03 | data.table: an R package everyone should use | Jason |
2011.02 | By-Group Processing, the R data.table and the Power of Open Source | Steve Miller |