JSON parser which picks up values directly without performing tokenization in Rust

This JSON parser implements an algorithm that uses in-memory indexing and parsing

Pikkr

Abstract

Pikkr is a JSON parser written in Rust that picks up values directly without performing tokenization. It is implemented based on Y. Li, N. R. Katsipoulakis, B. Chandramouli, J. Goldstein, and D. Kossmann. Mison: a fast JSON parser for data analytics. In VLDB, 2017.

This JSON parser extracts values from a JSON record without using finite state machines (FSMs) or performing tokenization. It parses JSON records in the following steps:

  1. [Indexing] Creates an index that maps the logical locations of queried fields to their physical locations, using SIMD instructions and bit manipulation.
  2. [Basic parsing] In the early stages, finds the values of queried fields by scanning a JSON record with the index created in the previous step, and learns their logical locations (i.e. the pattern of the JSON structure).
  3. [Speculative parsing] In the later stages, speculates the logical locations of queried fields from the learned patterns, jumps directly to their physical locations, and extracts the values. Falls back to basic parsing if the speculation fails.
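
The indexing step can be illustrated with a scalar sketch. This is not Pikkr's implementation: the description above relies on SIMD instructions and an index keyed by logical location, whereas this simplified version walks the record byte by byte and records only a flat bitmap of colons that lie outside string literals. The function names `structural_colon_bitmap` and `set_bit_positions` are hypothetical and chosen for illustration.

```rust
/// Builds a bitmap with one bit per input byte. Bit i is set when byte i is a
/// ':' that lies outside any string literal (a "structural" colon).
/// Assumption: a scalar loop stands in for the SIMD comparison.
fn structural_colon_bitmap(json: &[u8]) -> Vec<u64> {
    let mut bitmap = vec![0u64; (json.len() + 63) / 64];
    let mut in_string = false;
    let mut escaped = false;
    for (i, &b) in json.iter().enumerate() {
        if in_string {
            if escaped {
                escaped = false; // this byte was escaped by a backslash
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
        } else if b == b'"' {
            in_string = true;
        } else if b == b':' {
            bitmap[i / 64] |= 1u64 << (i % 64);
        }
    }
    bitmap
}

/// Lists the byte offsets of all set bits, lowest first.
fn set_bit_positions(bitmap: &[u64]) -> Vec<usize> {
    let mut out = Vec::new();
    for (word_idx, &word) in bitmap.iter().enumerate() {
        let mut w = word;
        while w != 0 {
            out.push(word_idx * 64 + w.trailing_zeros() as usize);
            w &= w - 1; // clear the lowest set bit
        }
    }
    out
}

fn main() {
    // The ':' inside the string "a:b" must not be indexed.
    let rec = br#"{"f1": "a:b", "f2": {"f1": 1}}"#;
    let colons = set_bit_positions(&structural_colon_bitmap(rec));
    println!("{:?}", colons);
}
```

The `w &= w - 1` step is the kind of bit manipulation that lets a parser hop from one set bit to the next without rescanning the bytes in between.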

This JSON parser performs well when a JSON data stream or JSON collection contains only a limited number of structural variants, which is a common case in the data analytics field.
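
To see why a small number of structural variants helps, the following sketch mimics speculation for a single top-level field. It is a simplified illustration under stated assumptions, not Pikkr's algorithm or API: it learns from the first record rather than from a configurable number of training records, handles only top-level keys, and ignores escaped quotes inside keys. The type `Speculative` and all function names are hypothetical.

```rust
/// Byte offsets of every ':' at nesting depth 1, outside string literals.
fn top_level_colons(json: &[u8]) -> Vec<usize> {
    let (mut depth, mut in_str, mut esc) = (0i32, false, false);
    let mut out = Vec::new();
    for (i, &b) in json.iter().enumerate() {
        if in_str {
            in_str = esc || b != b'"'; // an unescaped quote closes the string
            esc = !esc && b == b'\\';
            continue;
        }
        match b {
            b'"' => in_str = true,
            b'{' | b'[' => depth += 1,
            b'}' | b']' => depth -= 1,
            b':' if depth == 1 => out.push(i),
            _ => {}
        }
    }
    out
}

/// Key that ends just before `colon` (keys with escaped quotes are not handled).
fn key_before(json: &[u8], colon: usize) -> Option<&[u8]> {
    let end = json[..colon].iter().rposition(|&b| b == b'"')?;
    let start = json[..end].iter().rposition(|&b| b == b'"')?;
    Some(&json[start + 1..end])
}

/// Raw value after `colon`, up to the next ',' or closing bracket at its level.
fn value_after(json: &[u8], colon: usize) -> &[u8] {
    let (mut depth, mut in_str, mut esc) = (0i32, false, false);
    let mut end = json.len();
    for (i, &b) in json.iter().enumerate().skip(colon + 1) {
        if in_str {
            in_str = esc || b != b'"';
            esc = !esc && b == b'\\';
            continue;
        }
        match b {
            b'"' => in_str = true,
            b'}' | b']' | b',' if depth == 0 => { end = i; break; }
            b'{' | b'[' => depth += 1,
            b'}' | b']' => depth -= 1,
            _ => {}
        }
    }
    let raw = &json[colon + 1..end];
    let s = raw.iter().position(|b| !b.is_ascii_whitespace()).unwrap_or(raw.len());
    let e = raw.iter().rposition(|b| !b.is_ascii_whitespace()).map_or(s, |p| p + 1);
    &raw[s..e]
}

/// Remembers the top-level colon ordinal where the queried field was last found.
struct Speculative<'q> {
    field: &'q [u8],
    learned: Option<usize>,
    hits: usize,       // records answered by speculation
    full_scans: usize, // records that needed basic parsing
}

impl<'q> Speculative<'q> {
    fn new(field: &'q [u8]) -> Self {
        Speculative { field, learned: None, hits: 0, full_scans: 0 }
    }

    fn parse<'a>(&mut self, json: &'a [u8]) -> Option<&'a [u8]> {
        let colons = top_level_colons(json);
        // Speculation: jump to the learned ordinal and verify the key there.
        if let Some(&c) = self.learned.and_then(|k| colons.get(k)) {
            if key_before(json, c) == Some(self.field) {
                self.hits += 1;
                return Some(value_after(json, c));
            }
        }
        // Fallback to basic parsing: check every top-level key and re-learn.
        self.full_scans += 1;
        let k = colons.iter().position(|&c| key_before(json, c) == Some(self.field))?;
        self.learned = Some(k);
        Some(value_after(json, colons[k]))
    }
}

fn main() {
    let mut p = Speculative::new(b"f2");
    let recs: [&[u8]; 3] = [
        br#"{"f1": "a", "f2": 1}"#,
        br#"{"f1": "b", "f2": 2}"#,
        br#"{"f2": 3, "f1": "c"}"#, // different layout: speculation misses
    ];
    for rec in recs {
        println!("{:?}", p.parse(rec).map(|v| String::from_utf8_lossy(v)));
    }
    println!("hits={} full_scans={}", p.hits, p.full_scans);
}
```

When most records share one layout, nearly every call takes the speculative branch and compares a single key. Each new layout costs one full scan, which is why many distinct structural variants erode the benefit.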

Please read the paper mentioned in the opening paragraph for the details of the JSON parsing algorithm.

Performance

Benchmark Result

Hardware

Rust

Crates

  • serde_json 1.0.3
  • json 0.11.9
  • pikkr 0.16.0

JSON Data

  • "a JSON data set of startup company information" on JSON Data Sets | JSON Studio.

Benchmark Code

  • pikkr/rust-json-parser-benchmark: Rust JSON Parser Benchmark

Example

Code

Build

Run

Documentation

  • pikkr - Rust

Restrictions

  • The Rust nightly channel and a CPU with AVX2 support are required to build Rust code that depends on Pikkr and to run the resulting binary, because Pikkr uses AVX2 instructions.
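
A program can check for AVX2 at run time with the standard library's `is_x86_feature_detected!` macro. The sketch below is an illustration only and is not part of Pikkr: `has_avx2` and `require_avx2` are hypothetical helper names, and the sketch assumes the check runs before any AVX2 code path is entered.

```rust
/// True when the running CPU reports AVX2 support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx2() -> bool {
    is_x86_feature_detected!("avx2")
}

/// AVX2 is an x86 extension, so other architectures never have it.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx2() -> bool {
    false
}

/// Hypothetical start-up guard for a program that depends on an AVX2-only crate.
fn require_avx2() -> Result<(), String> {
    if has_avx2() {
        Ok(())
    } else {
        Err("this CPU does not support AVX2".to_string())
    }
}

fn main() {
    match require_avx2() {
        Ok(()) => println!("AVX2 available"),
        Err(msg) => eprintln!("cannot run: {}", msg),
    }
}
```

A guard like this only reports the problem more clearly. It does not remove the requirement, since the dependency itself still needs AVX2.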

Contributing

Any kind of contribution (e.g. comment, suggestion, question, bug report and pull request) is welcome.

Issues

Collection of the latest Issues

daniel-ferguson

I'm not 100% sure whether this crate is supposed to support accessing fields of objects within arrays but I'm seeing strange behaviour when doing so...

Given a JSON record:

The results of the following queries are:


Or as a runnable example (modelled on the example from the readme):

dtolnay

What guarantees does this library make about rejecting invalid JSON and returning valid JSON?

In particular, the following program accepts input that is not valid JSON and returns b"0," as output, which is also not valid JSON.

Information - Updated Jun 15, 2022

Stars: 591
Forks: 12
Issues: 5

Repositories & Extras

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically

Serde is Rust's greatest JSON weapon, with over 4.4K stars on GitHub and a massive developer community. In BRC's opinion, it is a core Rust library that every developer should learn.

A Rust implementation of JsonPath that also provides a similar API interface for WebAssembly and JavaScript

SIMD JSON for Rust  

Rust port of extremely fast serde compatibility

JSON-e Rust data-struct parameter crate for lightweight embedded content with objects and much more

What makes JSON-e unique is its extensive documentation and ease of use

A Rust JSON5 serializer and deserializer which speaks Serde

Deserialize a JSON5 string with from_str

Rust JSON Parser Benchmark

Download and Generate JSON Data

Read JSON values quickly - Rust JSON Parser

AJSON get json value with specified path, such as project

Command line json text parsing and processing utility

parsing json compliant with rust and cargo

Rust actix json request example

Send a json request to actix, and parse it

Why yet another JSON package in Rust ?

json_typegen - Rust types from JSON samples

json_typegen is a collection of tools for generating types from

Rust JSON parsing benchmarks

This project aims to provide benchmarks to show how various JSON-parsing libraries in the Rust programming language perform at various JSON-parsing tasks
