Parse Analyze Operate (Big) Data (on N servers in parallel)
landawn edited this page Jul 27, 2019 · 4 revisions
With the APIs provided in CSVUtil (load/import/export/parse/...), JdbcUtil (extractData/importData/parse/copy/...), IOUtil (parse/read/write/...), DataSet (count/filter/join/group/merge/...), N (...), JSONParser/XMLParser, and Lambda/Stream in Java 8, it is easy and fast to parse, analyze, and operate on GB/TB-scale data stored in files (CSV/JSON/XML/...) or in databases, on a single machine or across multiple machines. RemoteExecutor is designed to run big or heavy data-processing jobs on N servers in parallel.
// Export the account data from the database to a CSV file.
CSVUtil.exportCSV(file, conn, sql, 0, 1000, true, true);
// Load the data from the CSV file.
DataSet dataset = CSVUtil.loadCSV(Account.class, file);
// Find all accounts whose first name ends with "6".
DataSet account6 = dataset.filter("first_name", (String fn) -> fn.endsWith("6"));
// Group by last name and count the rows in each group.
DataSet groupedAccount6 = account6.groupBy("last_name", "last_name", "count", Collectors.counting());
// Save the result to a CSV file.
File out = new File("./unittest/result.csv");
groupedAccount6.toCSV(out);
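For comparison, the same filter/group/count/save steps can be sketched with only the JDK's Stream API. This is not how CSVUtil or DataSet are implemented. The class `CsvPipelineSketch` and its methods are hypothetical names used for illustration, and the sketch assumes a header row containing `first_name` and `last_name` and simple comma-separated fields with no quoting or embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of the library described on this page.
class CsvPipelineSketch {

    // Counts rows per last_name among the rows whose first_name ends with the given suffix.
    // Assumption: plain comma-separated values, header on the first line, no quoted fields.
    static Map<String, Long> countByLastName(Path csv, String suffix) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                throw new IOException("empty CSV file: " + csv);
            }
            List<String> header = Arrays.asList(headerLine.split(",", -1));
            int firstIdx = header.indexOf("first_name");
            int lastIdx = header.indexOf("last_name");
            if (firstIdx < 0 || lastIdx < 0) {
                throw new IOException("missing first_name or last_name column");
            }
            // Rows are streamed lazily, so the whole file is never held in memory at once.
            return reader.lines()
                    .filter(line -> !line.isEmpty())
                    .map(line -> line.split(",", -1))
                    // Skip malformed rows that are too short to contain both columns.
                    .filter(cols -> cols.length > Math.max(firstIdx, lastIdx))
                    .filter(cols -> cols[firstIdx].endsWith(suffix))
                    // TreeMap keeps the result sorted by last name.
                    .collect(Collectors.groupingBy(cols -> cols[lastIdx], TreeMap::new, Collectors.counting()));
        }
    }

    // Writes the counts as a two-column CSV file with a header row.
    static void writeCounts(Map<String, Long> counts, Path out) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add("last_name,count");
        counts.forEach((name, n) -> lines.add(name + "," + n));
        Files.write(out, lines, StandardCharsets.UTF_8);
    }
}
```

Calling `CsvPipelineSketch.countByLastName(path, "6")` and passing the result to `writeCounts` mirrors the filter, groupBy, and toCSV steps above. Only the grouped counts are kept in memory, which is the property that matters for large inputs.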