Commit: Cleanups

jimsmart committed Apr 14, 2021
1 parent 56b3fe3 commit 7d97b77
Showing 9 changed files with 83 additions and 84 deletions.
42 changes: 15 additions & 27 deletions TODO
@@ -1,13 +1,23 @@
TODO
====

- Document pk stuff.

- Write thorough tests for each non-trivial writer.

- Unify the tests:
- Shared structs
- Individual constructor/setup for each writer
- Shared test methods
- Individual verification methods for each writer
- Could include checking output file exists / doesn't exist

- Writers currently process all fields. We should ignore those without appropriate tags.
- I'm pretty sure this has been done, but we should ensure it is covered by a test.


- Write thorough tests for each non-trivial writer.
- SQLiteWriter
- ExcelWriter

- Support more types (currently only string and int)


- Writers
@@ -16,10 +26,10 @@ TODO


- Logging for CSV and Excel writers on completion, detailing files written?
Should be exposed as scrutiny metadata maybe?
- Should be exposed as scrutiny metadata maybe?


- Switch existing reflect stuff in Write to use a reflect walker.
- Switch existing reflect stuff in Write to use reflectwalk.
- Actually, I think this is overkill, we process flat structs.
- On second thoughts, it may simplify some of our code.

@@ -30,25 +40,3 @@ New writer types

?





Done
====

- Metadata reflector should walk structs.
- Metadata reflector should handle more types.
- Some kind of third-party reflection walker package? - reflectwalk

- Idea: structs could have a PrepareMetadata() method, which is called just before metadata is scraped from the struct (to facilitate late computation of metadata values)


JSONStreamWriter - Done, see JSONLWriter.
- See https://en.wikipedia.org/wiki/JSON_streaming
- To futz about with tags/etc, see https://stackoverflow.com/questions/42546519/how-do-i-dynamically-change-the-structs-json-tag

- Arguably, the metadata stuff doesn't belong in this package? - separation of concerns, and that.
- Done: See package scrutiny.

SQLiteWriter - Done
9 changes: 0 additions & 9 deletions base_writer.go
@@ -5,17 +5,11 @@ import (
)

type base struct {
	records       []interface{}                   // records is a list of an instance of each struct type.
	types         []reflect.Type                  // types is a list of reflected types for each struct type.
	headersByType map[reflect.Type][]string       // headersByType is a list of headers for each struct type.
	typesByType   map[reflect.Type][]reflect.Type // typesByType is a list of reflected field types for each struct type.
	tagsByType    map[reflect.Type][]string       // tagsByType is a list of field tags for each struct type.
}

// TODO(js) We have some redundancy. We can likely get rid of most/all of the lists in base?

// TODO What is records really used for? Logging? Metadata?

// register a type and collect its metadata.
// If the type is a newly registered type
// (has not been seen before),
@@ -35,9 +29,6 @@ func (w *base) register(x interface{}) (reflect.Type, bool) {
		return t, false
	}

	w.records = append(w.records, x)
	w.types = append(w.types, t)

	var headers []string
	var types []reflect.Type
	var tags []string
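For orientation, register (above) walks a newly seen struct type once and caches its headers, field types, and tags keyed by reflect.Type. Below is a minimal standalone sketch of that tag-driven metadata collection, assuming the `peanut` tag convention shown elsewhere in this commit; it is not the package's own code, and it skips untagged fields, as one of the TODO items asks for.

```go
package main

import (
	"fmt"
	"reflect"
)

// ParentRecord mirrors the example struct from the SQLiteWriter docs in this commit.
type ParentRecord struct {
	ParentID string `peanut:"parent_id,pk"`
	Name     string `peanut:"name"`
	Counter  int    `peanut:"counter"`
	Ignored  int    // no peanut tag: skipped
}

// collectMetadata returns the field names, field types and raw tag values
// for every field carrying a `peanut` tag.
func collectMetadata(x interface{}) (headers []string, types []reflect.Type, tags []string) {
	t := reflect.TypeOf(x)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("peanut")
		if !ok {
			continue // untagged fields are ignored
		}
		headers = append(headers, f.Name)
		types = append(types, f.Type)
		tags = append(tags, tag)
	}
	return headers, types, tags
}

func main() {
	h, _, tg := collectMetadata(&ParentRecord{})
	fmt.Println(h)  // [ParentID Name Counter]
	fmt.Println(tg) // [parent_id,pk name counter]
}
```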
13 changes: 11 additions & 2 deletions doc.go
@@ -2,7 +2,10 @@
// Its primary purpose is to provide a single consistent interface
// for easy, ceremony-free persistence of struct data.
//
// Currently supported formats are SQLite, JSON Lines, CSV/TSV and Excel.
// Each distinct struct type sent to Write() is written to an individual file/table,
// each named according to the name of the struct.
//
// Currently supported formats are SQLite, JSON Lines (JSONL), CSV/TSV, and Excel.
// Additional writers are also provided to assist with testing and debugging.
// Multiple writers can be combined using MultiWriter.
//
@@ -60,9 +63,15 @@
//
// When successfully completed:
//	err = w.Close()
//	// Handle error...
//
//	// Output files will be:
//	// - /some/path/my-data-ParentRecord.csv
//	// - /some/path/my-data-ChildRecord.csv
//
// Or, to abort the whole operation in the event of an error or cancellation:
// Or, to abort the whole operation in the event of an error or cancellation while writing records:
//	err = w.Cancel()
//	// Handle error...
//
// MultiWriter
//
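The package docs above describe the Write/Close/Cancel contract. Here is a hedged sketch of that flow against the Writer interface added later in this commit; the import path, and the idea that Cancel abandons partially written output, are assumptions, since neither is spelled out in this diff.

```go
package example

import "github.com/jimsmart/peanut" // assumed import path

// ChildRecord mirrors the struct named in the package docs.
type ChildRecord struct {
	ChildID  string `peanut:"child_id,pk"`
	Name     string `peanut:"name"`
	ParentID string `peanut:"parent_id"`
}

// WriteAll sends every record to w, then either finalises the output
// with Close, or aborts the whole operation with Cancel if a Write fails.
func WriteAll(w peanut.Writer, records []*ChildRecord) error {
	for _, r := range records {
		if err := w.Write(r); err != nil {
			_ = w.Cancel() // Abandon whatever has been written so far.
			return err
		}
	}
	return w.Close() // Flush buffers and finalise the output files.
}
```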
20 changes: 0 additions & 20 deletions jsonl_writer.go
@@ -8,15 +8,12 @@ package peanut
import (
"bufio"
"encoding/json"
"fmt"
"io/ioutil"
"log"
"os"
"reflect"
)

// TODO(js) Should this be called JSONLinesWriter?

var _ Writer = &JSONLWriter{}

// JSONLWriter writes records to JSON Lines files, writing
@@ -102,23 +99,6 @@ func (w *JSONLWriter) Write(x interface{}) error {
	return enc.Encode(mapValues(x))
}

func mapValues(x interface{}) map[string]interface{} {
	out := make(map[string]interface{})
	reflectStructValues(x, func(name string, t reflect.Type, v interface{}, tag string) {
		tag = firstTagValue(tag)
		switch t.Kind() {
		case reflect.String:
			out[tag] = v.(string)
		case reflect.Int:
			out[tag] = v.(int)
		default:
			m := fmt.Sprintf("Unknown type: %v", v)
			panic(m)
		}
	})
	return out
}

// Close flushes all buffers and writers,
// and closes the output files.
func (w *JSONLWriter) Close() error {
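JSONLWriter's Write boils down to enc.Encode(mapValues(x)) (see the hunk above), i.e. one JSON object per record per line. A rough standalone illustration of the resulting JSON Lines output, using hand-built maps in place of mapValues:

```go
package main

import (
	"encoding/json"
	"os"
)

func main() {
	// Two records, as mapValues-style maps keyed by peanut tag name.
	records := []map[string]interface{}{
		{"parent_id": "p1", "name": "Alice", "counter": 1},
		{"parent_id": "p2", "name": "Bob", "counter": 2},
	}
	enc := json.NewEncoder(os.Stdout)
	for _, m := range records {
		// Encode writes each object followed by a newline: one record per line.
		_ = enc.Encode(m)
	}
	// Output (JSON Lines / JSONL; encoding/json sorts map keys):
	// {"counter":1,"name":"Alice","parent_id":"p1"}
	// {"counter":2,"name":"Bob","parent_id":"p2"}
}
```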
17 changes: 0 additions & 17 deletions log_writer.go
@@ -5,7 +5,6 @@ import (
"log"
"os"
"reflect"
"strconv"
)

// TODO(js) This could have a multi-line mode, where each field is logged to its own line.
@@ -71,19 +70,3 @@ func (w *LogWriter) Cancel() error {
	}
	return nil
}

func stringValues(x interface{}) []string {
	var out []string
	reflectStructValues(x, func(name string, t reflect.Type, v interface{}, tag string) {
		switch t.Kind() {
		case reflect.String:
			out = append(out, v.(string))
		case reflect.Int:
			out = append(out, strconv.Itoa(v.(int)))
		default:
			m := fmt.Sprintf("Unknown type: %v", v) // TODO(js) This would be clearer if it used t.Name() ?
			panic(m)
		}
	})
	return out
}
5 changes: 3 additions & 2 deletions log_writer_test.go
@@ -15,8 +15,9 @@ var _ = Describe("LogWriter", func() {
It("should write the correct text when structs are written", func() {

type Foo struct {
StringField string `peanut:"test_string1"`
IntField int `peanut:"test_int1"`
StringField string `peanut:"test_string1"`
IntField int `peanut:"test_int1"`
IgnoredField int
}

testOutput := []*Foo{
36 changes: 35 additions & 1 deletion peanut.go
@@ -1,15 +1,16 @@
package peanut

import (
"fmt"
"reflect"
"strconv"
"strings"
"unicode"
"unicode/utf8"
)

// Writer defines a record-based writer.
type Writer interface {
	// Write is called to persist records.
	Write(r interface{}) error
	Close() error
	Cancel() error
@@ -19,6 +20,39 @@ type Writer interface {

const tagName = "peanut"

func stringValues(x interface{}) []string {
	var out []string
	reflectStructValues(x, func(name string, t reflect.Type, v interface{}, tag string) {
		switch t.Kind() {
		case reflect.String:
			out = append(out, v.(string))
		case reflect.Int:
			out = append(out, strconv.Itoa(v.(int)))
		default:
			m := fmt.Sprintf("Unknown type: %v", v) // TODO(js) This would be clearer if it used t.Name() ?
			panic(m)
		}
	})
	return out
}

func mapValues(x interface{}) map[string]interface{} {
	out := make(map[string]interface{})
	reflectStructValues(x, func(name string, t reflect.Type, v interface{}, tag string) {
		tag = firstTagValue(tag)
		switch t.Kind() {
		case reflect.String:
			out[tag] = v.(string)
		case reflect.Int:
			out[tag] = v.(int)
		default:
			m := fmt.Sprintf("Unknown type: %v", v)
			panic(m)
		}
	})
	return out
}

func reflectStructFields(x interface{}, fn func(name string, t reflect.Type, tag string)) {

	// TODO(js) This should work with Ptr and non-Ptr.
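mapValues keys each value by firstTagValue(tag), which is how the extra ",pk" option documented for SQLiteWriter below stays out of column names. firstTagValue itself isn't part of this diff, so the helper here is hypothetical, just to show the comma-separated tag convention:

```go
package main

import (
	"fmt"
	"strings"
)

// splitTag returns the column/field name (the first comma-separated value)
// and any remaining options, e.g. "pk".
// Hypothetical helper: peanut's real firstTagValue is not shown in this diff.
func splitTag(tag string) (name string, opts []string) {
	parts := strings.Split(tag, ",")
	return parts[0], parts[1:]
}

func main() {
	name, opts := splitTag("parent_id,pk")
	fmt.Println(name, opts) // parent_id [pk]

	name, opts = splitTag("counter")
	fmt.Println(name, opts) // counter []
}
```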
7 changes: 4 additions & 3 deletions rand_filename.go
@@ -1,6 +1,3 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package peanut

import (
@@ -14,6 +11,10 @@ import (

// This is copied from https://golang.org/src/io/ioutil/tempfile.go

// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Random number state.
// We generate random temporary file names so that there's a good
// chance the file doesn't exist yet - keeps the number of tries in
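The comments above explain the borrowed scheme: generate random temporary file names so the file almost certainly doesn't exist yet, keeping retries rare. A loose sketch of the same idea follows; the copied tempfile.go code keeps its own locked rand state and retry loop, so this is an illustration, not that implementation.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
)

// tempName appends a random suffix to prefix, in the spirit of the
// scheme copied from io/ioutil/tempfile.go.
func tempName(prefix string) string {
	return fmt.Sprintf("%s%d", prefix, rand.Uint32())
}

func main() {
	name := tempName("peanut-")
	// O_EXCL fails if the name already exists; the real code would then
	// simply try another random name.
	f, err := os.OpenFile(name, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0600)
	if err != nil {
		panic(err)
	}
	defer os.Remove(name)
	defer f.Close()
	fmt.Println("staging output in", name)
}
```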
18 changes: 15 additions & 3 deletions sqlite_writer.go
@@ -16,8 +16,6 @@ import (

// TODO(js) We should not be silently overwriting things. We should return an error somehow.

// TODO(js) Document primary key creation / accompanying ",pk" tag notation.

// SQLiteWriter writes records to an SQLite database,
// writing each record type to an individual table
// automatically.
@@ -42,6 +40,20 @@ import (
// Note that SQLiteWriter currently only handles
// string and int types:
// strings are output as TEXT, and ints as INTEGER.
//
// SQLiteWriter supports additional tag values to denote the primary key:
//	type ParentRecord struct {
//		ParentID string `peanut:"parent_id,pk"`
//		Name     string `peanut:"name"`
//		Counter  int    `peanut:"counter"`
//	}
//
//	type ChildRecord struct {
//		ChildID  string `peanut:"child_id,pk"`
//		Name     string `peanut:"name"`
//		ParentID string `peanut:"parent_id"`
//	}
// Compound primary keys are also supported.
type SQLiteWriter struct {
	*base
	tmpFilename string // tmpFilename is the filename used by the temp file.
@@ -50,7 +62,7 @@ type SQLiteWriter struct {
	db *sql.DB // db is the database instance.
}

// TODO Can we unify/simplify the constructors? Use pattern instead of prefix/suffix maybe? (not here, but for others)
// TODO(js) Can we unify/simplify the constructors? Use pattern instead of prefix/suffix maybe? (not here, but for others)

// NewSQLiteWriter returns a new SQLiteWriter,
// using the given filename + ".sqlite" as its final output location.
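To tie the new ",pk" documentation together, here is a hedged usage sketch. The single-argument NewSQLiteWriter call follows its quoted doc comment, the import path is assumed, and the CREATE TABLE shown in the comment is illustrative rather than the writer's literal SQL.

```go
package main

import (
	"log"

	"github.com/jimsmart/peanut" // assumed import path
)

// ParentRecord is the pk-tagged example from the SQLiteWriter docs.
type ParentRecord struct {
	ParentID string `peanut:"parent_id,pk"`
	Name     string `peanut:"name"`
	Counter  int    `peanut:"counter"`
}

func main() {
	// Final output becomes my-data.sqlite, per NewSQLiteWriter's doc comment.
	w := peanut.NewSQLiteWriter("./my-data")

	// Each struct type gets its own table, named after the struct; the
	// ",pk" tag marks the primary key column, roughly along the lines of:
	//   CREATE TABLE ParentRecord (parent_id TEXT, name TEXT, counter INTEGER, PRIMARY KEY (parent_id))
	if err := w.Write(&ParentRecord{ParentID: "p1", Name: "Alice", Counter: 1}); err != nil {
		_ = w.Cancel()
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
}
```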
