datafuselabs/databend

Say Hello to Databend: The Open Source Cloud Warehouse for Everyone

Databend is inspired by Apache Arrow and lets you scale your cloud warehouse using off-the-shelf open-source stacks, with the fast processing speeds you would expect from, say, BigQuery.

A Modern Cloud Data Warehouse with Both Elasticity and Performance on Object Storage

Documentation | Benchmarking | Roadmap(v0.8)


  • What is Databend?
  • Design Overview
    • Meta Service Layer
    • Compute Layer
    • Storage Layer
  • Getting Started
  • Community
  • Roadmap

What is Databend?

Databend is an open-source, elastic, and workload-aware modern cloud data warehouse.

Databend uses the latest techniques in vectorized query processing to let you run blazing-fast data analytics on object storage (S3, Azure Blob, or MinIO).

  • Instant Elasticity

    Databend completely separates storage from compute, which allows you to easily scale up or down based on your application's needs.

  • Blazing Performance

    Databend leverages data-level parallelism (Vectorized Query Execution) and instruction-level parallelism (SIMD) to deliver blazing-fast data analytics.

  • Support for Semi-Structured Data

    Databend supports ingestion of semi-structured data in various formats such as CSV, JSON, and Parquet, located either in the cloud or on your local file system. Databend also supports the semi-structured data types ARRAY, MAP, and JSON, which make it easy to import and operate on semi-structured data.

  • MySQL/ClickHouse Compatible

    Databend is ANSI SQL compliant and MySQL/ClickHouse wire protocol compatible, making it easy to connect with existing tools (MySQL Client, ClickHouse Client, Vector, DBeaver, Jupyter, JDBC, etc.).

  • Easy to Use

    Databend has no indexes to build, requires no manual tuning, and needs no manual partitioning or sharding of data. It is all done for you as data is loaded into the table.

Design Overview

This is the high-level architecture of Databend. It consists of three components:

  • meta service layer
  • compute layer
  • storage layer

Meta Service Layer

The meta service is a layer that serves multiple tenants. This layer implements a persistent key-value store that holds each tenant's state. In the current implementation, the meta service has the following components:

  • Metadata, which manages all metadata of databases, tables, clusters, transactions, etc.
  • Administration, which stores user info, user management, access control information, usage statistics, etc.
  • Security, which performs authorization and authentication to protect the privacy of users' data.

The Meta Service Layer code mainly resides in the metasrv directory of the repository.

Compute Layer

The compute layer is the layer that carries out computation for query processing. This layer may consist of many clusters, and each cluster may consist of many nodes. Each node is a computing unit and is a collection of components:

  • Planner

    The query planner builds an execution plan from the user's SQL statement and represents the query with different types of relational operators (such as Projection, Filter, Limit, etc.).

  • Optimizer

    A rule-based optimizer that applies rules such as predicate pushdown and pruning of unused columns.

  • Processors

    A pull- and push-based query execution pipeline, built from the planner's instructions. Each pipeline executor is a processor (such as SourceTransform, FilterTransform, etc.) with zero or more inputs and zero or more outputs. Processors are connected into a pipeline, which can also be distributed across multiple nodes depending on your query workload.

A node is the smallest unit of the compute layer. A set of nodes can be registered as one cluster via a namespace. Many clusters can attach to the same database, so they can serve queries from different users in parallel. When you add new nodes to a cluster, the currently running computational tasks can be scaled onto them (known as work-stealing).

The Compute Layer code is mainly in the query directory.
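
To make the planner and processor description above more concrete, here is a minimal Python sketch of a pipeline built from relational operators. It only illustrates the pull side of the model, using generators. The function names (source, filter_rows, project, run_query) are hypothetical and are not Databend APIs; Databend's own processors are written in Rust.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]


def source(rows: Iterable[Row]) -> Iterator[Row]:
    """Source processor: zero inputs, one output."""
    yield from rows


def filter_rows(rows: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter processor: forwards only rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))


def project(rows: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Projection processor: keeps only the requested columns."""
    return ({c: row[c] for c in columns} for row in rows)


def run_query(table: Iterable[Row]) -> List[Row]:
    # Roughly: SELECT a FROM table WHERE b > 10 LIMIT 2
    filtered = filter_rows(source(table), lambda r: r["b"] > 10)
    projected = project(filtered, ["a"])
    # Limit: islice stops pulling from upstream after two rows.
    return list(islice(projected, 2))


table = [
    {"a": 1, "b": 5},
    {"a": 2, "b": 20},
    {"a": 3, "b": 30},
    {"a": 4, "b": 40},
]
result = run_query(table)
print(result)
```

Because every stage is lazy, the Limit stage decides how many rows are ever pulled through Filter and Projection, which is the property a pull-based pipeline provides.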

Storage Layer

Databend stores data in an efficient, columnar format as Parquet files. Each Parquet file is sorted by the primary key before being written to the underlying shared storage. For efficient pruning, Databend also creates indexes for each Parquet file:

  • min_max.idx: this index file stores the minimum and maximum values of the Parquet file.
  • sparse.idx: this index file stores the <key, parquet-page> mapping at a granularity of every [N] records.

With these indexes, queries run faster because I/O and CPU costs are reduced. Suppose Parquet file f1 has a min_max.idx of [3, 5) and Parquet file f2 has a min_max.idx of [4, 6) for column x. If the query predicate is WHERE x < 4, only f1 needs to be accessed and processed.
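
The pruning example above can be sketched in a few lines of Python. This is an illustration only: the MinMax class and files_for_less_than function are hypothetical names, and the real index file layout is not shown in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MinMax:
    file: str
    lo: int  # inclusive minimum of column x in the file
    hi: int  # exclusive upper bound, following the [lo, hi) notation in the text


def files_for_less_than(indexes: List[MinMax], bound: int) -> List[str]:
    """Return the files that may contain rows with x < bound."""
    # A file can be skipped when even its smallest value is not below the bound.
    return [idx.file for idx in indexes if idx.lo < bound]


indexes = [MinMax("f1", 3, 5), MinMax("f2", 4, 6)]
selected = files_for_less_than(indexes, 4)
print(selected)
```

For WHERE x < 4, f2 is skipped because its minimum is 4, so only f1 is read.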

Getting Started

Deployment

  • How to Deploy Databend With MinIO
  • How to Deploy Databend With AWS S3
  • How to Deploy Databend With Azure Blob Storage
  • How to Deploy Databend With Wasabi Object Storage
  • How to Deploy Databend With Scaleway OS
  • How to Deploy Databend With Tencent COS
  • How to Deploy Databend With Alibaba OSS
  • How to Deploy Databend With QingCloud QingStore
  • How to Deploy a Databend Local Cluster With MinIO
  • How to Deploy a Databend K8s Cluster With MinIO

Connect

  • How to Connect Databend With MySQL Client
  • How to Connect Databend With ClickHouse Client
  • How to Connect Databend With DBeaver SQL IDE
  • How to Execute Queries in Python
  • How to Query Databend in Jupyter Notebooks
  • How to Execute Queries in Golang
  • How to Work With Databend in Node.js

Users

  • How to Create a User
  • How to Grant Privileges to a User
  • How to Revoke Privileges From a User
  • How to Create a Role
  • How to Grant Privileges to a Role
  • How to Grant Role to a User
  • How to Revoke Role From a User

Tables

  • How to Create a Database
  • How to Drop a Database
  • How to Create a Table
  • How to Drop a Table
  • How to Rename a Table
  • How to Truncate a Table

Views

  • How to Create a View
  • How to Drop a View
  • How to Alter a View

Load Data

  • How to Load Data From Local File System
  • How to Load Data From Amazon S3
  • How to Load Data From Databend Stages
  • How to Load Data From MySQL

Use Case

  • Analyzing GitHub Repository With Databend
  • Analyzing Nginx Access Logs With Databend
  • User Retention Analysis With Databend
  • Conversion Funnel Analysis With Databend

Performance

  • How to Benchmark Databend

Community

For general help in using Databend, please refer to the official documentation. For additional help, you can use one of these channels to ask a question:

  • Slack (For live discussion with the Community)
  • GitHub (Feature/Bug reports, Contributions)
  • Twitter (Get the news fast)
  • Weekly (A weekly newsletter about Databend)
  • I'm feeling lucky (Pick up a good first issue now!)

Roadmap

  • Roadmap v0.8
  • Roadmap 2022

License

Databend is licensed under Apache 2.0.

Acknowledgement

  • Databend is inspired by ClickHouse and Snowflake, and its computing model is based on Apache Arrow.
  • The documentation website is hosted by Vercel.

Issues

Collection of the latest Issues

Xuanwo

I will track the progress of i18n support for databend.rs here.

In general, we will adopt the features provided by Docusaurus (i18n - Introduction) and Crowdin.

In this workflow, we will:

  • Upload all text that needs to be translated (by Crowdin)
  • Translate text in Crowdin
  • Commit back to databend (by Crowdin)

Current status

Checklist

flaneur2020

C-feature

Summary

Distributed tracing is an effective tool for diagnosing performance issues.

However, recording tracing spans has a performance impact. Can we have a dynamic setting that enables or disables tracing spans? That way we can enable tracing when we need to diagnose performance issues and keep it disabled most of the time to reduce the overall performance impact.

youngsofun

C-bug

Search before asking

  • I had searched in the issues and found no similar issues.

Version

main

What's Wrong?

The CI ran for a long time (55m 59s), so I cancelled it and saw this at the end of the log:

https://github.com/datafuselabs/databend/runs/6551932830?check_suite_focus=true

test api::rpc::flight_dispatcher::test_run_shuffle_action_with_scatter has been running for over 60 seconds

How to Reproduce?

ci

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
dantengsky

C-feature

Summary

Provide a way of marking historical snapshots invisible, so that old snapshots (and maybe the data they reference) can fade away gradually.

Basic description of the functionality:

Marks the latest visible snapshot of the given table.

  • A system configuration, let's say table_retention_time: Duration
  • A system procedure, which marks the latest visible snapshot of a table
    • by inserting/updating a specified key in the KV service
    • TimeTravel of table data will respect this mark

NOTE: The query nodes work on their local clocks, which are NOT perfectly synced


Basic idea of the implementation:

  • provide a system procedure, let's say call system$retention_mark([database_name,] table_name)
    • grab the metadata of the specified table
    • check whether the key LATEST_VISBLE_SNAPHOST of the given table exists: LATEST_VISBLE_SNAPHOST/<tid> -> timestamp
      • if it exists and its value is less than (now() + table_retention_time), try to update it to (now() + table_retention_time)
      • if it does not exist, try to insert the kv pair
    • and of course, the mutations should be executed in a kv transaction
      • the most important invariant of this operation: the value of LATEST_VISBLE_SNAPHOST/<tid> should only ever increase
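
The steps above can be sketched roughly as follows, in Python for illustration. KvStore is a hypothetical in-memory stand-in for the KV service, with a compare-and-swap call playing the role of the kv transaction. The key spelling and the (now() + table_retention_time) formula are taken verbatim from the proposal; everything else is an assumption.

```python
import threading
from typing import Dict, Optional


class KvStore:
    """Hypothetical in-memory stand-in for the KV service."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[int]:
        with self._lock:
            return self._data.get(key)

    def compare_and_swap(self, key: str, expected: Optional[int], new: int) -> bool:
        """Write `new` only if the stored value still equals `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


def retention_mark(kv: KvStore, table_id: int, now: int, retention: int) -> int:
    """Advance the mark for a table; the stored value never decreases."""
    key = f"LATEST_VISBLE_SNAPHOST/{table_id}"  # key name as spelled in the proposal
    candidate = now + retention  # formula as written in the proposal
    while True:
        current = kv.get(key)
        if current is not None and current >= candidate:
            return current  # keep the larger existing mark
        if kv.compare_and_swap(key, current, candidate):
            return candidate
        # Another writer changed the key in between; re-read and retry.


kv = KvStore()
first = retention_mark(kv, 42, now=1000, retention=60)
second = retention_mark(kv, 42, now=900, retention=60)   # a node with a slow clock
third = retention_mark(kv, 42, now=2000, retention=60)
print(first, second, third)
```

The second call comes from a node whose clock is behind and leaves the mark untouched, which is the "only ever increase" invariant.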

Notes:

  • if database_name is not provided, use the context's current database name
  • A "hurried" marker, whose clock is far ahead of the real time, may mark the LATEST_VISBLE_SNAPHOST "incorrectly"
    • We have to live with it, hoping it is not too far off : )

      e.g. if the clock is two months ahead, the history of the table may not be accessible for the next 2 months.

      To mitigate this situation: the value of LATEST_VISBLE_SNAPHOST/<tid> could be changed to the timestamp of the snapshot, by navigating to the snapshot S at (now() + table_retention_time). Thus, snapshots generated after S could be accessible if the clocks go back to normal.

    • The "current snapshot" referenced by the KV meta is always visible

ZhiHanZ

C-feature

Summary

We need to configure a fixed bootstrapping process for each tenant. Currently we rely heavily on the call command, which restricts extensibility. We could instead put that SQL logic into a file and pass it as a secret to each tenant's query node.

dantengsky

C-feature

Summary:

Enhance the select query so that a time point can be specified, something like this:

select * from t at(offset => -2) as t where ...

https://docs.snowflake.com/en/sql-reference/constructs/at-before.html

A basic idea of the implementation (not a strict requirement; any other functionally equivalent implementation is ok):

  • enhance the corresponding parser components to encode the timepoint specification (the at(offset...) part) into DfQueryStatement

  • carry the timepoint spec all the way to ToReadDataSourcePlan::read_plan_with_catalog

    • detect whether the underlying table instance supports time travel
    • if it does, navigate there, and construct the ReadDataSourcePlan accordingly

    also see https://github.com/datafuselabs/databend/issues/5514
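
As a rough illustration of the "navigate there" step, the Python sketch below walks a chain of snapshots back to the newest one that existed at the requested time point. It assumes the offset is a number of seconds relative to the current time, as in the Snowflake AT clause linked above. The Snapshot class and navigate function are hypothetical and do not reflect Databend's actual snapshot format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    timestamp: float                   # creation time, in seconds
    prev: Optional["Snapshot"] = None  # link to the previous snapshot


def navigate(latest: Snapshot, now: float, offset_seconds: float) -> Optional[Snapshot]:
    """Return the newest snapshot created at or before now + offset_seconds."""
    target = now + offset_seconds
    current: Optional[Snapshot] = latest
    while current is not None and current.timestamp > target:
        current = current.prev
    return current  # None means the table had no snapshot at that time


s1 = Snapshot("s1", 100.0)
s2 = Snapshot("s2", 105.0, s1)
s3 = Snapshot("s3", 109.0, s2)

# at(offset => -2) evaluated at time 110 targets time 108, which resolves to s2.
found = navigate(s3, now=110.0, offset_seconds=-2)
too_old = navigate(s3, now=110.0, offset_seconds=-20)
print(found.snapshot_id if found else None, too_old)
```

A table that does not support time travel would skip this step and read its current snapshot.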

Versions

Find the latest versions by id

v0.7.59-nightly - May 24, 2022

What's Changed

Bug Fixes

  • fix(function): fix object_keys array type (#5532) by @b41sh
  • fix(metasrv): Fix env config not loaded correctly (#5552) by @Xuanwo
  • bugfix(executor): Fix incorrect context usage (#5539) by @leiysky

Build/Testing/CI

  • feat(function): Support variant as function (#5442) by @b41sh
  • fix flaky test (#5544) by @TCeason
  • ci: Fix crowdin not configure correctly (#5546) by @Xuanwo
  • feat(function): Support variant max/min functions (#5525) by @b41sh

Documentation

  • feat(function): Support variant as function (#5442) by @b41sh
  • docs: Fix get started is missing (#5554) by @Xuanwo
  • docs: Fix sidebar is missing (#5549) by @Xuanwo
  • website: Add i18n support (#5545) by @Xuanwo

Features

  • Feature: Support TRIM function in new planner (#5541) by @ygf11
  • Feature: add metasrv time travel functions (#5468) by @lichuang
  • feat(function): Support variant as function (#5442) by @b41sh
  • feat: snapshot timestamp & navigation (#5535) by @dantengsky
  • feat(function): Support variant max/min functions (#5525) by @b41sh

Others

  • Feature: add metasrv time travel functions (#5468) by @lichuang
  • feat(function): Support variant max/min functions (#5525) by @b41sh

v0.7.58-nightly - May 23, 2022

What's Changed

Bug Fixes

  • fix(planner): Fix wrong result of aggregate in subquery (#5538) by @leiysky
  • fix(scripts): deploy minio in k8s failed (#5526) by @hantmac

Features

  • feature(planner): Common tree structure formatter for plan display (#5512) by @leiysky

Improvement

  • improvement(planner): use DataBlock::gather_blocks in hash join (#5534) by @xudong963
  • improvement(planner): hash join ~6x performance improvement (#5497) by @xudong963

Others

  • feature(planner): Common tree structure formatter for plan display (#5512) by @leiysky

Performance Improvement

  • improvement(planner): use DataBlock::gather_blocks in hash join (#5534) by @xudong963

v0.7.57-nightly - May 22, 2022

What's Changed

Features

  • feature(planner): Translate subquery into apply operator (#5510) by @leiysky

Others

  • feature(planner): Translate subquery into apply operator (#5510) by @leiysky

v0.7.56-nightly - May 21, 2022

What's Changed

Bug Fixes

  • fix(array): fix incorrect column meta of Array (#5507) by @sundy-li

Documentation

  • docs: Revert changes introduced in #5493 (#5494) by @Xuanwo

v0.7.55-nightly - May 20, 2022

What's Changed

Bug Fixes

  • fixes(processor): fix server hang when parallel execute query (#5482) by @zhang2014
  • fix(httphandler): req should return as soon as results is exhausted. (#5462) by @youngsofun

Build/Testing/CI

  • ci: Add issues labeled A-storage into storage project (#5486) by @Xuanwo
  • feat(function): Support compare variant with other data types (#5463) by @b41sh

Documentation

  • doc(deploy): add sql statement terminator (#5492) by @ZuoFuhong

Features

  • feat(table statistics): add statistics to TableMeta (#5476) by @dantengsky
  • chore: Towards the next nightly (#5478) by @Xuanwo
  • feat(fuse): add system$clustering_information function (#5426) by @zhyass
  • feat(function): Support compare variant with other data types (#5463) by @b41sh

Improvement

  • chore(parser): delete redundancy reserved function token (#5490) by @fkuner
  • chore(datavalues): replace todos with ErrorCode (#5475) by @sundy-li

Others

  • feat(table statistics): add statistics to TableMeta (#5476) by @dantengsky

v0.7.54-nightly - May 19, 2022

What's Changed

Build/Testing/CI

  • deps(tests): Fix toml is missing (#5470) by @Xuanwo

Features

  • feat(function): support object_keys function (#5461) by @fkuner

Improvement

  • chore: remove unneccesary user creation during tenant boot call (#5471) by @ZhiHanZ

v0.7.53-nightly - May 19, 2022

What's Changed

Bug Fixes

  • fix(function): fix retention aggregation coredump bug (#5450) by @fkuner

Build/Testing/CI

  • chore: sqllogic test ci (#5464) by @ZeaLoVe
  • chore: add feature flagging for logic tst (#5460) by @ZhiHanZ
  • chore: allow additonal headers in sql logic test http handler (#5457) by @ZhiHanZ

Documentation

  • docs: Add I'm feeling lucky (#5459) by @Xuanwo

Features

  • MySQL Handler Kill Query (#5448) by @TCeason

Improvement

  • chore: allow additonal headers in sql logic test http handler (#5457) by @ZhiHanZ

Others

  • chore: allow additonal headers in sql logic test http handler (#5457) by @ZhiHanZ

v0.7.52-nightly - May 18, 2022

What's Changed

Build/Testing/CI

  • chore: Move all databend generted folders into .databend (#5446) by @Xuanwo

Features

  • feature(planner): Enhance GROUP BY semantic check (#5431) by @leiysky

Improvement

  • fix(metasrv): Display friendly error if not started with valid flags (#5443) by @Xuanwo
  • [Improving] remove unnecessary info log (#5445) by @ariesdevil

v0.7.51-nightly - May 18, 2022

What's Changed

Bug Fixes

  • fix(query): Fix test_query_log when RUST_BACKTRACE is not set (#5440) by @Xuanwo
  • fix(functions): make aggregate function sum/avg/min/max support null … (#5436) by @sundy-li
  • fix(data type): update arrow2 to fix array nullable write (#5429) by @b41sh

Features

  • feat(function): support date_add for new parser (#5419) by @fkuner
  • feature(format): refactor output format (#5422) by @sundy-li
  • Feature: Support DISTINCT in new planner (#5410) by @ygf11

Improvement

  • refactor(meta/config): Adapt RFC Config Backward Compatibility (#5421) by @Xuanwo
  • chore: replace for {if...} with iterator find (#5435) by @xudong963

v0.7.50-nightly - May 17, 2022

What's Changed

Bug Fixes

  • fixes(insert): fix drop dispatcher when commit insert query (#5424) by @zhang2014
  • fix(parser): wrong error code. (#5414) by @youngsofun
  • fixes(handler): fix clickhouse handler dead loop when error (#5412) by @zhang2014

Build/Testing/CI

  • feat(test): Sql logic test framework improve (#5416) by @ZeaLoVe

Documentation

  • feat(test): Sql logic test framework improve (#5416) by @ZeaLoVe

Features

  • feat(planner): support using and natural for join (#5423) by @xudong963
  • Feature: Metasrv metrics (#5420) by @lichuang

Improvement

  • feat(test): Sql logic test framework improve (#5416) by @ZeaLoVe
  • refactor(query/config): Adapt RFC Config Backward Compatibility (#5409) by @Xuanwo
  • Improvement: remove meta client from session manager (#5411) by @junnplus

v0.7.49-nightly - May 16, 2022

What's Changed

Bug Fixes

  • fix(datatypes): fix datetime from negative micro timestamp bug (#5396) by @sundy-li
  • fixes(processor): fix server hang when processor panic (#5394) by @zhang2014

Build/Testing/CI

  • ci: MacOS's stateless standalone has been disabled (#5404) by @Xuanwo

Features

  • support tenant quota (#5406) by @junnplus
  • Feat(httphandler): result download (#5395) by @youngsofun
  • feat(metasrv): add metrics to http service (#5389) by @RinChanNOWWW

Improvement

  • improvement: add rows_limit for sort (#5403) by @xudong963
  • refactor(query/config/storage): Adapt RFC Config Backward Compatibility (#5399) by @Xuanwo
  • chore: delete redundancy code (#5400) by @xudong963
  • fix(function): map access support array and function (#5357) by @b41sh

v0.7.48-nightly - May 15, 2022

What's Changed

Bug Fixes

  • [meta] fix: query enables embedded meta only when meta.address is configured to be empty. Do not check endponit. Some user does not have endpoints in their old config (#5388) by @drmingdrmer
  • fixes(format): fixes incorrect rows size for csv stream load (#5383) by @zhang2014
  • bugfix(parser): t.a should be a column ref (#5370) by @andylokandy

Documentation

  • chore(doc): add window funnel to learn (#5387) by @BohuTANG
  • doc add http handler auth (#5385) by @wubx
  • chore(doc): windows funnel (#5386) by @BohuTANG
  • chore(doc): add user retention analysis to learn (#5384) by @BohuTANG
  • feat(function): Support connection_id function (#5381) by @b41sh

Features

  • feat(function): change retention return type from Variant to Array (#5302) by @fkuner
  • Feature: add more metrics in metasrv (#5376) by @lichuang
  • feat(function): Support connection_id function (#5381) by @b41sh

Others

  • Feature: add more metrics in metasrv (#5376) by @lichuang

v0.7.47-nightly - May 14, 2022

What's Changed

Bug Fixes

  • bugfix(executor): Fix wrong result of memory table engine (#5364) by @leiysky

Documentation

  • chore(doc): update readme (#5366) by @BohuTANG

Features

  • feat(planner): support subquery table reference type for new planner (#5279) by @xudong963
  • feature(planner): Support context function in new planner (#5369) by @leiysky

Improvement

  • chore(parser): use datavalues::IntervalType instead of DateTimeField (#5373) by @andylokandy
  • chore(functions): fix get function unnecessary double column loop (#5349) by @b41sh

Others

  • feature(planner): Support context function in new planner (#5369) by @leiysky

v0.7.46-nightly - May 13, 2022

What's Changed

Documentation

  • chore(doc): Add azblob config and title (#5361) by @BohuTANG
  • chore(doc): remove root from query toml (#5354) by @BohuTANG
  • chore(readme): add how to deploy azure blob stroage (#5348) by @BohuTANG
  • chore(doc): add azure blob storage deploy (#5347) by @BohuTANG
  • chore(doc): gnu -> musl (#5341) by @BohuTANG

Features

  • feature(planner): Support some scalar expressions in new planner (#5362) by @leiysky
  • feat(planner): support map access expression (#5358) by @andylokandy
  • RFC: Config Backward Compatibility (#5324) by @Xuanwo
  • Feature: add metrics in metasrv (#5208) by @lichuang

Improvement

  • chore: use x.to_string() instead of format!("{}", x) (#5359) by @andylokandy
  • improvement: avoid ub and delele unnecessary unsafe (#5338) by @xudong963
  • save one rpc to metasrv (#5345) by @TCeason
  • improvement(planner): Minor refactor of binder (#5339) by @leiysky
  • chore(functions): make array get return null if out of bounds happens (#5322) by @sundy-li

Others

  • feature(planner): Support some scalar expressions in new planner (#5362) by @leiysky
  • Feature: add metrics in metasrv (#5208) by @lichuang
  • improvement(planner): Minor refactor of binder (#5339) by @leiysky

v0.7.45-nightly - May 12, 2022

What's Changed

Bug Fixes

  • Temporarily delete todo codes to fix panic bug (#5321) by @ariesdevil
  • fix(storage/azblob): Azblob API uri not constructed correctly (#5316) by @Xuanwo
  • fix(planner): fix some cases in aggregator plan (#5307) by @xudong963

Documentation

  • chore(doc): musl -> gnu (#5313) by @BohuTANG

Features

  • feat(planner): support limit for new planner (#5301) by @fkuner
  • feat(parser): add span for expression (#5309) by @andylokandy

Improvement

  • add extra columns for show full tables (#5317) by @junnplus
  • chore(feature): refactor trait TypeDeserializer (#5312) by @sundy-li

v0.7.44-nightly - May 11, 2022

What's Changed

Build/Testing/CI

  • Revert "chore(query): introduce meta Runtime" (#5300) by @sundy-li
  • feat(function): Support generic Array access elements by index (#5244) by @b41sh
  • feat(function): support length function for Array & Array (#5274) by @fkuner

Documentation

  • feat(function): Support generic Array access elements by index (#5244) by @b41sh

Features

  • store endpoints to metasrv and use balance endpoints grpc connection channel (#4987) by @ariesdevil
  • impl alter database rename (#5286) by @TCeason
  • feat(function): Support generic Array access elements by index (#5244) by @b41sh
  • Feature: user api pb convert impl (#5296) by @lichuang
  • feat(function): support length function for Array & Array (#5274) by @fkuner
  • Feature: add user common types to pb impl (#5289) by @lichuang
  • feat(planner): display error with source span (#5290) by @andylokandy

Improvement

  • store endpoints to metasrv and use balance endpoints grpc connection channel (#4987) by @ariesdevil
  • chore(functions): make array-get returns non null datatype (#5306) by @sundy-li
  • refactor: Reuse StorageConfig in stage (#5280) by @Xuanwo
  • chore(query): introduce meta Runtime (#5294) by @sundy-li
  • chore(settings): remove unused UserSettings from meta/types (#5293) by @BohuTANG

Others

  • feat(function): Support generic Array access elements by index (#5244) by @b41sh
  • feat(function): support length function for Array & Array (#5274) by @fkuner

v0.7.43-nightly - May 10, 2022

What's Changed

Documentation

  • feat(doc): add Array(T) docs (#5266) by @b41sh
  • chore(doc): add query config (#5278) by @BohuTANG
  • docs: Update API key for algolia (#5273) by @Xuanwo
  • docs: Algolia enforce using databend-rs as prefix (#5272) by @Xuanwo

Features

  • feature(planner): Support subqueries in new planner (#5283) by @leiysky
  • Feature: multi-catalog (#4947) by @dantengsky
  • feat(format): support parquet input format (#5271) by @zhang2014
  • feat(planner): support order by in new planner (#5253) by @xudong963

Improvement

  • refine: of PR #4947 multi-catalog (#5284) by @dantengsky
  • feature(query): make expression serialized to raw sql (#5260) by @sundy-li
  • refine(planner): remove OrderExpr from Scalar (#5281) by @xudong963
  • Remove unauthenticated behavior (#5263) by @junnplus
  • refactor: Move configs to common so we can reuse it (#5270) by @Xuanwo
  • chore(planner): make code neat with DataField::new() (#5269) by @xudong963

Others

  • feature(planner): Support subqueries in new planner (#5283) by @leiysky

v0.7.42-nightly - May 09, 2022

What's Changed

Bug Fixes

  • feat(format): add scan progress values (#5262) by @zhang2014
  • bugfix(pipeline): Fix state machine of hash join (#5242) by @leiysky
  • fixes(format): fix string type csv truncate failure (#5243) by @zhang2014
  • fix(ast): improve helper message for error (#5239) by @andylokandy

Build/Testing/CI

  • feat(scripts/setup): Install jdk (#5255) by @Xuanwo

Documentation

  • chore(doc): refine the document (#5248) by @BohuTANG

Features

  • feat(planner): select without from (#5256) by @Veeupup
  • feat: Add HDFS support (#5245) by @Xuanwo

Improvement

  • refactor: Rename S3StageTable into StageTable (#5251) by @Xuanwo

Others

  • bugfix(pipeline): Fix state machine of hash join (#5242) by @leiysky

v0.7.41-nightly - May 08, 2022

What's Changed

Bug Fixes

  • fix(parser): show alternative tokens even if the branch is optional (#5230) by @andylokandy
  • bugfix(planner): Fix wrong result of hash join when join keys have different types (#5222) by @leiysky

Build/Testing/CI

  • feat(data type): ArrayType support inner dataType (#5049) by @b41sh

Documentation

  • chore(doc): add Continuous Benchmarking to menu (#5228) by @BohuTANG
  • feat(data type): ArrayType support inner dataType (#5049) by @b41sh

Features

  • feat(data type): ArrayType support inner dataType (#5049) by @b41sh

Improvement

  • feat(format): implement format trait (#5167) by @zhang2014
  • chore(meta): remove un-used warehouse codes (#5229) by @BohuTANG
  • chore: Make cargo clippy --all-targets happy (#5223) by @Xuanwo

Others

  • bugfix(planner): Fix wrong result of hash join when join keys have different types (#5222) by @leiysky

v0.7.40-nightly - May 07, 2022

What's Changed

Bug Fixes

  • fix: Handle exception display bug (#5218) by @Chasen-Zhang

Build/Testing/CI

  • feat(planner):integrate the stateless test for the new planner's aggregation (#5204) by @xudong963

Documentation

  • ISSUE-5170 remove stale config items (#5217) by @5kbpers

Features

  • feat(planner): Implement hash inner join (#5175) by @leiysky
  • feat: Introduce opendal 0.6 and enable retry support (#5216) by @Xuanwo
  • feat(planner):integrate the stateless test for the new planner's aggregation (#5204) by @xudong963
  • add access check for management mode (#5211) by @junnplus
  • feat(query): support timezone (#4878) by @Veeupup
  • feat(group_by): support two-level hashmap (#5075) by @fkuner

Improvement

  • ISSUE-5170 remove stale config items (#5217) by @5kbpers
  • feat(parser): display alternatives in error message (#5213) by @andylokandy
  • feat(group_by): support two-level hashmap (#5075) by @fkuner
  • Refactor reduce statistics logic (#5201) by @zhyass

Others

  • feat(planner): Implement hash inner join (#5175) by @leiysky
  • feat(group_by): support two-level hashmap (#5075) by @fkuner

Performance Improvement

  • feat(group_by): support two-level hashmap (#5075) by @fkuner

v0.7.39-nightly - May 06, 2022

What's Changed

Build/Testing/CI

  • feat(planner): support having and scalar expression in group by for new planner (#5200) by @xudong963

Features

  • feat(planner): support having and scalar expression in group by for new planner (#5200) by @xudong963
  • Add cluster key statistics in block meta (#5194) by @zhyass
  • Feature: impl get_table_by_id with kv-txn (#5185) by @lichuang
  • Feature: impl upsert_table_option with kv-txn (#5183) by @lichuang

Improvement

  • feat: Dynamically update the release version (#5184) by @Chasen-Zhang
  • [metasrv] refactor: remove specialized util methods, fix comments (#5182) by @drmingdrmer
  • fix roles is empty (#5176) by @junnplus

v0.7.38-nightly - May 05, 2022

What's Changed

Bug Fixes

  • fix(planner): make aggregator work and add simple stateless tests (#5165) by @xudong963

Build/Testing/CI

  • fix(planner): make aggregator work and add simple stateless tests (#5165) by @xudong963

Features

  • feat(fuse): add system$fuse_segment function (#5172) by @BohuTANG
  • feat(planner): Refine Scalar with enum_dispatch and support more scalar expressions (#5162) by @leiysky

Improvement

  • feat(common): checkout to official MutableBitmap (#5177) by @ygf11
  • Bump version of dependencies arrow2 and parquet2 (#5173) by @dantengsky
  • chore(warehouse): remove un-used warehouse codes (#5166) by @BohuTANG
  • chore(fuse): rename fuse_history -> fuse_snapshot (#5155) by @BohuTANG

Others

  • refactor(http_handler): remove /v1/statement.rs. (#5169) by @youngsofun
  • feat(planner): Refine Scalar with enum_dispatch and support more scalar expressions (#5162) by @leiysky

v0.7.37-nightly - May 04, 2022

What's Changed

v0.7.36-nightly - May 03, 2022

What's Changed

Bug Fixes

  • chore(base): disable backtrace by default (#5127) by @sundy-li
  • fix: fix trim (#5136) by @jiahui-97

Documentation

  • chore(doc): operator to index.md (#5144) by @BohuTANG
  • Chore(doc): IP Address upper case (#5143) by @BohuTANG
  • chore(doc): refine the function documentation (#5142) by @BohuTANG
  • chore(doc): refine the Readme (#5128) by @BohuTANG

Features

  • feat: Add scalar function humanize (#5073) by @cadl
  • feat(planner): Support TableFunction in new planner (#5135) by @leiysky
  • feat(planner): support more aggregate syntax (#5115) by @xudong963

Improvement

  • fix(backtrace): improve message when backtrace is disabled (#5141) by @sundy-li
  • improvement: add category for some functions (#4843) by @jiahui-97
  • chore(base): disable backtrace by default (#5127) by @sundy-li
  • refactor(processor): refactor insert into query for fuse engine (#5139) by @zhang2014
  • chore(query): remove some #[allow(dead_code)] (#5137) by @BohuTANG
  • chore(functions): snake_case name for cast function, toXX -> to_xx (#5126) by @sundy-li
  • improve(function): Specialize NumberFunction with input type (#5130) by @leiysky
  • improve(function): Specialize CastFunction with from type (#5124) by @leiysky
  • chore(QueryContext): make QueryContex as the first parameter (#5125) by @BohuTANG

Others

  • feat(planner): Support TableFunction in new planner (#5135) by @leiysky
  • improve(function): Specialize NumberFunction with input type (#5130) by @leiysky
  • improve(function): Specialize CastFunction with from type (#5124) by @leiysky

v0.7.35-nightly - May 01, 2022

What's Changed

Improvement

  • chore(datavalues): rename arc() to new_impl() (#5117) by @sundy-li
  • improvement(function): simplify in function (#5121) by @fkuner

Others

  • improvement(function): simplify in function (#5121) by @fkuner

v0.7.34-nightly - Apr 30, 2022

What's Changed

Bug Fixes

  • chore(planner): fix duplicate column name (#5112) by @sundy-li

Build/Testing/CI

  • feat(build): add thrift to dev setup (#5110) by @dantengsky

Documentation

  • chore(doc): add guides to README (#5118) by @BohuTANG
  • chore(doc): add how-to guides page (#5113) by @BohuTANG

Features

  • Introduce a helper ExpressionEvaluator to simplify expression evaluation (#5108) by @leiysky
  • feat(parser): support more statements (#5089) by @andylokandy

Improvement

  • chore(datatype): add BOOL alias (#5116) by @BohuTANG
  • feature(datablocks): add gather kernels for datablock (#5114) by @sundy-li

Others

  • Introduce a helper ExpressionEvaluator to simplify expression evaluation (#5108) by @leiysky

v0.7.33-nightly - Apr 29, 2022

What's Changed

Bug Fixes

  • fix: clickhouse worker hang when interpreter fail to execute (#5091) by @chowc
  • bug: fix interval_function flaky test (#5094) by @Veeupup
  • fix(functions): use drop guard to ensure the states dropped (#5097) by @sundy-li
  • fix(functions): manually drop state in function eval_aggr function (#5080) by @sundy-li

Build/Testing/CI

  • feat(data type): variant add alias json, object add alias map (#5099) by @b41sh

Dependency Updates

  • build(deps): bump cross-fetch from 3.1.4 to 3.1.5 in /website (#5098) by @dependabot[bot]

Documentation

  • chore(doc): deploy with QingCloud QingStore (#5103) by @BohuTANG

Features

  • feat(data type): variant add alias json, object add alias map (#5099) by @b41sh

Improvement

  • [common/meta] Refactor: DropDatabaseReq use DatabaseNameIdent to specify the db to delete (#5104) by @drmingdrmer
  • [common/meta] refactor: CreateDatabaseReq use DatabaseNameIdent to specify the db to create (#5102) by @drmingdrmer
  • Improvement: support jwt verify without kid (#5101) by @junnplus
  • improvement: remove useless precision convert & remove tz in type timestamp (#5084) by @Veeupup
  • fix(functions): use drop guard to ensure the states dropped (#5097) by @sundy-li

Others

  • build(deps): bump cross-fetch from 3.1.4 to 3.1.5 in /website (#5098) by @dependabot[bot]

v0.7.32-nightly - Apr 28, 2022

What's Changed

Bug Fixes

  • fix doc parser (#5088) by @jiahui-97
  • hotfix(build): Revert "feat: add "instal_pgk thrift" to dev_setup.sh" (#5085) by @dantengsky
  • chore(query): manually drop the aggregate states to avoid memory leak (#5056) by @sundy-li

Build/Testing/CI

  • hotfix(build): Revert "feat: add "instal_pgk thrift" to dev_setup.sh" (#5085) by @dantengsky
  • build(dev_setup): add "install_pgk thrift" to dev_setup.sh (#5081) by @dantengsky

Documentation

  • chore(doc): Better title for metasrv and add backup/restore using mydumper (#5078) by @BohuTANG

Features

  • feat(planner): implement aggregate operator in new planner framework (#5027) by @xudong963
  • Feature: add leave node API (#5069) by @lichuang

Improvement

  • feat(plan): add enum-dispatch support to BasePlan to avoid downcast_ref (#5077) by @sundy-li
  • feat: using enum_dispatch to represent data_type(DataTypePtr -> DataTypeImpl) (#5063) by @PsiACE

Others

  • feat(httphandler): remove backtrace from response and log it. (#5082) by @youngsofun

Performance Improvement

  • feat: using enum_dispatch to represent data_type(DataTypePtr -> DataTypeImpl) (#5063) by @PsiACE

v0.7.31-nightly - Apr 27, 2022

What's Changed

Features

  • use jwtk for es512 (#5062) by @junnplus

Improvement

  • feat: date/timestamp bound check (#5054) by @Veeupup
  • chore(mysql): support mydumper/myloader to dump and load databend schema (#5068) by @BohuTANG

Others

  • feat(query_log): record parse errors. (#5070) by @youngsofun

v0.7.30-nightly - Apr 26, 2022

What's Changed

Bug Fixes

  • fix(parser): allow to omit semicolon (#5058) by @andylokandy

Documentation

  • chore(test): move some select tests from dummy test and fix the datetime type in doc (#5055) by @BohuTANG

Features

  • feat: memory profiling (#5050) by @dantengsky
  • feat(planner): Support select operator in new planner framework (#5059) by @leiysky
  • Feature: transaction api (#5030) by @lichuang

Improvement

  • chore(storage): Cleanup the uncommitted files generated during OCC (#5061) by @dantengsky
  • fix(parser): allow to omit semicolon (#5058) by @andylokandy
  • chore(test): move some select tests from dummy test and fix the datetime type in doc (#5055) by @BohuTANG

Information - Updated May 25, 2022

Stars: 4.0K
Forks: 404
Issues: 369
