Features and Roadmap
High-performance Database
Columnar in-memory engine for extremely high throughput and low latency
Columnar hybrid engine (in-memory and disk-based) delivers fast performance for data warehouses with vast amounts of data
Flexible partition schemes: value, range, list, hash, and composite partitions
Support millions of partitions per table
In-database analytics: complex computations can be executed within the database, significantly reducing data transfer time
Native support for processing time series data with up to nanosecond precision
Standard SQL with enhancements such as panel data processing, bi-temporal joins (asof join, window join), window functions, pivoting, and composite columns
Table co-location for fast joins
Support data compression
Support dynamically increasing table columns
Highly expressive. Support imperative programming, functional programming, vector programming, SQL programming, and RPC (remote procedure call) programming.
Easy to learn. The syntax is very similar to SQL and Python.
About 600 built-in functions for various data types (number, temporal, string), data structures (vector, matrix, set, dictionary, table), and system calls (file, database, distributed computing).
Extended functionalities with user defined functions and plugins
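To illustrate the asof-join idea mentioned above, here is a minimal concept sketch in plain Python (not the database's own syntax; the table names and columns are hypothetical). Each left-side row is matched with the most recent right-side row at or before its timestamp, a common pattern for pairing trades with quotes:

```python
import bisect

def asof_join(trades, quotes):
    """For each (timestamp, price) trade, attach the value of the most
    recent quote at or before that timestamp. `quotes` must be sorted
    by timestamp. Illustrative sketch only."""
    quote_times = [t for t, _ in quotes]
    joined = []
    for ts, price in trades:
        # Index of the last quote with timestamp <= ts, or -1 if none exists
        i = bisect.bisect_right(quote_times, ts) - 1
        joined.append((ts, price, quotes[i][1] if i >= 0 else None))
    return joined

trades = [(3, 100.5), (7, 101.0)]
quotes = [(1, 100.0), (5, 100.8), (9, 101.2)]
print(asof_join(trades, quotes))
# → [(3, 100.5, 100.0), (7, 101.0, 100.8)]
```

An in-database engine can evaluate this kind of join directly over partitioned, sorted columns, avoiding the sort-and-scan work shown here.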
Distributed Computing
High speed distributed computing through in-memory engine, data localization, fine-grained data partitioning, and parallel computing.
Offer various built-in computing models such as pipeline, map-reduce, and iterative computing.
Provide snapshot isolation for computations on distributed dynamic data
Boost system throughput by sharing data copies in memory among multiple jobs
Efficient programming for distributed computing: a script written on one node can be executed across the entire cluster instantly, with no compilation or deployment required
Automatic data replica management for load balancing and fault tolerance with an embedded distributed file system
Convenient horizontal scaling on both storage capacity and computing capacity
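The map-reduce computing model listed above can be sketched in a few lines of Python. This is a single-process illustration with made-up data, not the product's API: in a real cluster each partition would live on a different node, the map step would run where the data lives, and only the small partial results would travel over the network:

```python
from functools import reduce

# Hypothetical partitions of a distributed table (one list per node)
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

def map_step(partition):
    # Runs next to the data: produce a partial (sum, count) per partition
    return (sum(partition), len(partition))

def reduce_step(a, b):
    # Combine partial results into a global (sum, count)
    return (a[0] + b[0], a[1] + b[1])

total, count = reduce(reduce_step, map(map_step, partitions))
print(total / count)  # → 5.0, the global average, without moving raw rows
```

The key design point is that the reduce step only ever sees tiny aggregates, which is why fine-grained partitioning and data localization translate into high throughput.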
Real-time Data Streaming
Adopt publish/subscribe framework. Support chained subscription.
First-class support for stream-table duality. Publishing a message is equivalent to inserting a row into a table. Can use SQL queries on local or distributed streaming data.
Deliver messages with sub-millisecond latency
Update the historical data warehouse with live data at sub-second delay.
Replay historical messages from an arbitrary offset.
Provide configurable building blocks (e.g. partition, worker, queue) for traffic control and performance tuning
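Stream-table duality and chained subscription, as described above, can be sketched with a minimal in-process Python class (the class and field names are hypothetical, not the product's API). Appending a row to a table *is* publishing a message, and a subscriber may itself publish into another stream table, forming a chain:

```python
class StreamTable:
    """Minimal sketch of stream-table duality: every row insert is
    published to all subscribers of the table."""
    def __init__(self):
        self.rows = []
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def append(self, row):
        # Inserting a row is equivalent to publishing a message
        self.rows.append(row)
        for handler in self.subscribers:
            handler(row)

source = StreamTable()
derived = StreamTable()

# Chained subscription: a handler on `source` filters rows and
# republishes them into `derived`, which has its own subscribers.
source.subscribe(lambda r: derived.append(r) if r["price"] > 100 else None)
derived.subscribe(lambda r: print("alert:", r))

source.append({"sym": "A", "price": 99.5})
source.append({"sym": "B", "price": 101.2})  # prints alert: {'sym': 'B', ...}
```

Because every stream is also a table (`source.rows`, `derived.rows`), the same data can be queried with SQL after the fact, which is the essence of the duality.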
System Management and APIs/plugins
Embedded web interface for cluster management, performance monitoring and data access.
System monitoring via built-in functions, web interface, or Prometheus.
Portable IDE for data analysis.
Programming APIs for C++, C#, Java, Python, R, JavaScript and Excel.
User access control on tables and functions
Run user-defined functions as scheduled tasks
We are working on the following features:
Use just-in-time compilation to improve the performance of iterative computing.
Offer more built-in machine learning packages.
Provide support for other distributed file systems.