# query-interpreter

**README LAST UPDATED: 04-24-25**

This project is under active development and is subject to change often and drastically as I am likely an idiot.

Core program to interpret query language strings into structured data, and back again.
## Data Structure Philosophy

We operate on the philosophy that the first-class data is SQL statement strings.

From these strings we derive all structured data types that represent those SQL statements,
whether they are CRUD or schema operations.

All of these structs must implement the `Query` interface:
```go
type Query interface {
	GetFullSql() string
}
```

So every struct we create from SQL must be able to provide a full and valid SQL
statement of itself.

We can then alter the fields of these structs programmatically to create
new statements altogether.
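
As a minimal sketch of that idea, the block below implements `Query` on a stripped-down struct and then alters a field to get a new statement. `SimpleSelect` and its fields are hypothetical and exist only for this illustration; the real structs in this project carry more detail.

```go
package main

import (
	"fmt"
	"strings"
)

// Query mirrors the interface shown in this README.
type Query interface {
	GetFullSql() string
}

// SimpleSelect is a hypothetical, stripped-down struct used only for this sketch.
type SimpleSelect struct {
	Table   string
	Columns []string
}

// GetFullSql rebuilds a complete SQL statement from the struct's fields.
// An empty column list is rendered as a wildcard.
func (s SimpleSelect) GetFullSql() string {
	cols := "*"
	if len(s.Columns) > 0 {
		cols = strings.Join(s.Columns, ", ")
	}
	return fmt.Sprintf("SELECT %s FROM %s;", cols, s.Table)
}

func main() {
	s := SimpleSelect{Table: "users", Columns: []string{"id", "name"}}

	// The struct satisfies Query, so it can hand back its own SQL.
	var q Query = s
	fmt.Println(q.GetFullSql())

	// Altering a field programmatically yields a new statement.
	s.Table = "accounts"
	fmt.Println(s.GetFullSql())
}
```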
## SQL Tokens

We currently use DataDog's SQL tokenizer, `sqllexer`, to scan through SQL strings.
The general token types it defines can be found [here](/docs/SQL_Token_Types.md).

These are an OK generalization to start with when parsing SQL, but they cannot be used
without some extra conditional logic that checks what the actual values are.

Currently we scan through each string to tokenize it. While stepping through the tokens we try
to determine the type of query we are working with. At that point we assume the overall structure
of the rest of the statement fits a particular format, then parse the details of
the statement into the struct corresponding to its query type.
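
A rough sketch of the "check the actual value" step might look like the following. The `Token`, `TokenType`, and `QueryType` definitions here are hypothetical stand-ins written for this example; they are not the `sqllexer` types, and the real detection logic in this project may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenType and Token are hypothetical stand-ins for a lexer's output.
type TokenType int

const (
	Keyword TokenType = iota
	Identifier
	Punctuation
)

type Token struct {
	Type  TokenType
	Value string
}

// QueryType is a hypothetical label for the kind of statement being parsed.
type QueryType string

const (
	SelectQuery  QueryType = "SELECT"
	InsertQuery  QueryType = "INSERT"
	UpdateQuery  QueryType = "UPDATE"
	DeleteQuery  QueryType = "DELETE"
	SchemaQuery  QueryType = "SCHEMA"
	UnknownQuery QueryType = "UNKNOWN"
)

// detectQueryType looks at the value of the first keyword token. The broad
// token type (Keyword) alone does not say which statement this is, so the
// value itself has to be checked.
func detectQueryType(tokens []Token) QueryType {
	for _, t := range tokens {
		if t.Type != Keyword {
			continue
		}
		switch strings.ToUpper(t.Value) {
		case "SELECT":
			return SelectQuery
		case "INSERT":
			return InsertQuery
		case "UPDATE":
			return UpdateQuery
		case "DELETE":
			return DeleteQuery
		case "CREATE", "ALTER", "DROP":
			return SchemaQuery
		}
		// The first keyword was not one we recognize.
		return UnknownQuery
	}
	return UnknownQuery
}

func main() {
	tokens := []Token{
		{Keyword, "select"},
		{Identifier, "id"},
		{Keyword, "from"},
		{Identifier, "users"},
	}
	fmt.Println(detectQueryType(tokens))
}
```

Once the query type is known, the rest of the tokens can be handed to a parser that assumes that statement's format.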
## Scan State

As stated, we scan through the strings, processing each chunk, delineated by spaces and
punctuation, as a token. To properly interpret the tokens from their broad `token.Type`s, we
have to keep state of what else we have processed so far.

This state is determined by a set of flags that depends on the query type.

For example, a Select query will have:
```go
passedSELECT := false
passedColumns := false
passedFROM := false
passedTable := false
passedWHERE := false
passedConditionals := false
passedOrderByKeywords := false
passesOrderByColumns := false
```

The general philosophy for these flags is to name and use them in the context of what has
already been processed through the scan, which makes naming and reading new flags trivial.

A `Select` object is shaped as follows:
```go
type Select struct {
	Table        string
	Columns      []Column
	Conditionals []Conditional
	OrderBys     []OrderBy
	Joins        []Join
	IsWildcard   bool
	IsDistinct   bool
}

type Column struct {
	Name              string
	Alias             string
	AggregateFunction AggregateFunctionType
}

type AggregateFunctionType int

const (
	MIN AggregateFunctionType = iota + 1
	MAX
	COUNT
	SUM
	AVG
)

// dependency in query.go
type Conditional struct {
	Key       string
	Operator  string
	Value     string
	DataType  string
	Extension string // AND, OR, etc
}

type OrderBy struct {
	Key       string
	IsDescend bool // SQL queries with no ASC|DESC on their ORDER BY are ASC by default, hence why this bool for the opposite
}
```
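
To show how the "passed" flags and the `Select` shape could fit together, here is a small sketch of a flag-driven scan. It is only an illustration under stated assumptions: `strings.Fields` stands in for the real tokenizer, `Select` and `Column` are trimmed copies of the shapes above, `parseSelect` is a hypothetical name, and only the `SELECT ... FROM ...` part of a statement is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// Trimmed copies of the README's shapes, kept small for this sketch.
type Column struct {
	Name string
}

type Select struct {
	Table      string
	Columns    []Column
	IsWildcard bool
}

// parseSelect walks whitespace-separated chunks and uses "passed" flags to
// decide what each chunk means. It assumes columns are separated by ", ".
func parseSelect(sql string) Select {
	var sel Select

	// Each flag records what the scan has already moved past.
	passedSELECT := false
	passedColumns := false
	passedFROM := false
	passedTable := false

	trimmed := strings.TrimSuffix(strings.TrimSpace(sql), ";")
	for _, tok := range strings.Fields(trimmed) {
		upper := strings.ToUpper(tok)
		switch {
		case !passedSELECT:
			if upper == "SELECT" {
				passedSELECT = true
			}
		case !passedColumns:
			// Seeing FROM means the column list is finished.
			if upper == "FROM" {
				passedColumns = true
				passedFROM = true
				continue
			}
			name := strings.TrimSuffix(tok, ",")
			if name == "*" {
				sel.IsWildcard = true
			} else if name != "" {
				sel.Columns = append(sel.Columns, Column{Name: name})
			}
		case passedFROM && !passedTable:
			sel.Table = tok
			passedTable = true
		}
	}
	return sel
}

func main() {
	fmt.Printf("%+v\n", parseSelect("SELECT id, name FROM users;"))
	fmt.Printf("%+v\n", parseSelect("select * from orders"))
}
```

Because every flag is named for what has already been seen, each `case` reads as "we are past X but not yet past Y", which is the naming philosophy described above.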
## Improvement Possibilities
- Maybe utilize the `lookBehindBuffer` more to cut down the number of state flags in the scans?