Abhinav Gupta | About

How to parse Newline Delimited JSON in Go

1. Introduction

Newline Delimited JSON (ndjson) is a format for representing a stream of structured objects, with one JSON value per line. A stream in this format takes the following form.

{"base": "white rice", "proteins": ["tofu"]}
{"base": "salad", "proteins": ["tuna", "salmon"]}

We can parse these with Go’s encoding/json package.

2. Parsing JSON

Before delving into Newline Delimited JSON, let’s review how to parse a single JSON value with encoding/json.

Consider the Order type.

type Order struct {
    Base     string   `json:"base"`
    Proteins []string `json:"proteins"`
}

To parse a JSON payload into an Order object, we use json.Unmarshal.

var order Order
if err := json.Unmarshal(data, &order); err != nil {
    return fmt.Errorf("parse order: %w", err)
}

fmt.Println(order)
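For reference, here is a minimal, self-contained sketch of the snippet above as a runnable program. The parseOrder helper and the sample payload are assumptions made for illustration; only the Order type and the json.Unmarshal call come from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Order mirrors the Order type defined above.
type Order struct {
	Base     string   `json:"base"`
	Proteins []string `json:"proteins"`
}

// parseOrder is a hypothetical helper that parses a single JSON payload
// held entirely in memory.
func parseOrder(data []byte) (Order, error) {
	var order Order
	if err := json.Unmarshal(data, &order); err != nil {
		return Order{}, fmt.Errorf("parse order: %w", err)
	}
	return order, nil
}

func main() {
	// Sample payload, taken from the first line of the ndjson example.
	data := []byte(`{"base": "white rice", "proteins": ["tofu"]}`)

	order, err := parseOrder(data)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", order)
}
```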

2.1. Using json.Decoder

The encoding/json package exports json.Decoder which provides more control over JSON parsing. Part of its interface is shown below.

type Decoder
    func NewDecoder(io.Reader) *Decoder
    func (*Decoder) Decode(interface{}) error

The following use of json.NewDecoder to parse a single JSON value is roughly equivalent to the prior use of json.Unmarshal.

var order Order
if err := json.NewDecoder(src).Decode(&order); err != nil {
    return fmt.Errorf("parse order: %w", err)
}

fmt.Println(order)

It’s roughly equivalent with one significant difference: the input is now an io.Reader instead of a []byte.

json.Unmarshal(data, &order)            // data is a []byte
json.NewDecoder(src).Decode(&order)     // src is an io.Reader

3. Parsing Newline Delimited JSON

json.Decoder can parse a JSON value from an io.Reader without reading its entire contents.

Additionally, if Decode is called on the same decoder multiple times, it parses consecutive JSON values from the io.Reader.

decoder := json.NewDecoder(src)

var o1 Order
if err := decoder.Decode(&o1); err != nil {
    return err
}

var o2 Order
if err := decoder.Decode(&o2); err != nil {
    return err
}

// ...

This is helpful for ndjson, but it’s unclear when we should stop reading. To help with that, json.Decoder includes the More() bool method, which reports whether there’s more JSON input available on the io.Reader.

type Decoder
    func (*Decoder) More() bool

With its help, we can use json.Decoder to parse Newline Delimited JSON like so.

decoder := json.NewDecoder(src)
for decoder.More() {
    var order Order
    if err := decoder.Decode(&order); err != nil {
        return fmt.Errorf("parse order: %w", err)
    }

    fmt.Println(order)
}
