Skip to content

Memory issue with large queries #840

Open
@alex-statsig

Description

@alex-statsig

Describe the bug
When formatting a large query containing thousands of ids (5,000-10,000+), the formatter uses a very large amount of memory, often OOMing the process.

Expected behavior
While this is a rather large query to deal with, many query engines can handle it fine. The memory consumption to format a <100kB query should not consume GBs of RAM

Actual behavior
The formatting function consumed RAM of nearly 50,000x the original query length

Usage

Here's an example script which generates two similar types of queries which both run into the issue:

const values = Array(10000).fill('myid')

// Large "in" clause
const sql = `SELECT * FROM my_table WHERE col in (${values.map((v) => "'" + v + "'").join(",")})`
// Large number of "or" clauses
// const sql = `SELECT * FROM my_table WHERE (${values.map((v) => "col = '" + v + "'").join(" OR ")})`
const {format} = require('sql-formatter')
const formatted = format(sql, {language: 'sql'})
console.log(Math.round(process.memoryUsage().heapUsed / 1024 / 1024) + " MB")

The first query uses 1.5GB of memory, and the second OOMs with 4GB of memory

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions