Releases: bullet-db/bullet-core

Rate limiting for BufferingSubscriber, New functions: EXPLODE, LATERAL VIEW, NOT RLIKE, NOT RLIKE ANY, TRIM, ABS, BETWEEN, NOT BETWEEN, SUBSTRING, UNIXTIMESTAMP

05 May 19:47

This release adds several new functions that are supported through BQL:

EXPLODE - for exploding maps and lists into constituents in the SELECT or with a LATERAL VIEW
LATERAL VIEW - to be used with EXPLODE to pair each row in a result with every element of its exploded field (a per-row cross product)
NOT RLIKE, NOT RLIKE ANY - the NOT versions of the supported RLIKE and RLIKE ANY
BETWEEN and NOT BETWEEN - two functions to check if a numeric or string-typed value is between two other values or fields
SUBSTRING - to get parts of a String with support for negative indexing
UNIXTIMESTAMP - to get the current UTC unix timestamp or to convert a given date argument or field into a unix timestamp with an optional pattern for parsing the date

It also adds rate limiting capabilities to the com.yahoo.bullet.pubsub.BufferingSubscriber so that the various PubSub implementations that use it can support rate limiting if needed. A new constructor has been added to this class; it enables rate limiting and takes arguments for the maximum number of messages per given time interval.
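
The idea of "a number of messages per given time interval" can be sketched with a simple fixed-window limiter. This is only an illustration under stated assumptions: WindowRateLimiter, tryAcquire and the injected clock are hypothetical names and are not the BufferingSubscriber API or its actual algorithm.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window limiter; NOT the BufferingSubscriber implementation.
final class WindowRateLimiter {
    private final int maxMessages;      // messages allowed per interval
    private final long intervalMillis;  // length of one interval
    private final LongSupplier clock;   // injected so the limiter can be tested without sleeping
    private long windowStart;
    private int count;

    WindowRateLimiter(int maxMessages, long intervalMillis, LongSupplier clock) {
        this.maxMessages = maxMessages;
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Returns true if one more message may be handed out in the current interval.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= intervalMillis) {
            // A new interval has started, so reset the budget.
            windowStart = now;
            count = 0;
        }
        if (count < maxMessages) {
            count++;
            return true;
        }
        return false;
    }
}
```

A subscriber built this way would check tryAcquire() before returning each buffered message and return nothing once the budget for the interval is spent.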

First release using Screwdriver

27 Apr 19:35

First release on Maven Central - Bintray EOL

23 Apr 00:44

Yaml loading from jvyaml to snakeyaml

24 Mar 23:14

This release updates how YAML is loaded in BulletConfig. We switched the library from jvyaml to snakeyaml, largely to allow parsing empty strings in values and to support a more up-to-date YAML specification.

Storage layer updates

04 Jan 17:47

This release revamps the Storage layer to make it more adaptable to different storages. It breaks the existing interface, but since the storage layer was in pre-release and not used anywhere, this is just a minor version update for bullet-core. The various classes can be found under com.yahoo.bullet.core.storage.

  1. The StorageManager has different interfaces to work with byte[], String and now takes a type bound to get implementation specific objects.
  2. StorageManager now supports working with namespaces (to encapsulate multiple logical units of storage - think tables) and partitions per namespace. It supports a new Criteria interface to query and modify arbitrary storages that need interfaces beyond what the basic StorageManager provides.
  3. Three reference implementations:
    3.1 NullStorageManager - use if no storage is needed.
    3.2 MemoryStorageManager - unpartitioned, namespace-less implementation for storing everything in memory
    3.3 MultiMemoryStorageManager - supports partitions and namespaces. An example criteria for counting, MultiMemoryCountingCriteria, is provided
  4. A new StorageConfig class for storages similar to PubSubConfig for PubSubs
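
To make the namespace and partition model concrete, here is a minimal in-memory sketch. NamespacedMemoryStore and its methods are hypothetical and are not the StorageManager or Criteria interfaces; the count method only stands in for the kind of question a counting criteria might answer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of namespaces (think tables) holding partitions of key-value pairs.
final class NamespacedMemoryStore {
    // namespace -> partition -> key -> value
    private final Map<String, Map<Integer, Map<String, byte[]>>> data = new HashMap<>();

    void put(String namespace, int partition, String key, byte[] value) {
        data.computeIfAbsent(namespace, n -> new HashMap<>())
            .computeIfAbsent(partition, p -> new HashMap<>())
            .put(key, value);
    }

    // Returns the stored value, or null if the namespace, partition or key is absent.
    byte[] get(String namespace, int partition, String key) {
        Map<Integer, Map<String, byte[]>> partitions =
            data.getOrDefault(namespace, Collections.emptyMap());
        Map<String, byte[]> entries = partitions.getOrDefault(partition, Collections.emptyMap());
        return entries.get(key);
    }

    // Counts the entries in a namespace across all of its partitions.
    int count(String namespace) {
        Map<Integer, Map<String, byte[]>> partitions =
            data.getOrDefault(namespace, Collections.emptyMap());
        return partitions.values().stream().mapToInt(Map::size).sum();
    }
}
```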

Ternary Logic

30 Oct 23:31

This release migrates to Bullet Record 1.1.0, which supports ternary logic. All boolean operations now work with a ternary (3-valued) logic system with NULL as the third value. This follows the standard SQL semantics when comparing or operating on NULLs. There should be no interface changes, but behaviors will be different when operating on NULLs.
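
For readers unfamiliar with 3-valued logic, the standard SQL truth tables can be sketched as follows. The Ternary class is a hypothetical illustration that uses a Java null for NULL; it is not the Bullet Record implementation.

```java
// Hypothetical sketch of SQL-style three-valued logic; null stands in for NULL (unknown).
final class Ternary {
    // FALSE dominates AND: FALSE AND NULL is FALSE, but TRUE AND NULL is NULL.
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
            return Boolean.FALSE;
        }
        if (a == null || b == null) {
            return null;
        }
        return Boolean.TRUE;
    }

    // TRUE dominates OR: TRUE OR NULL is TRUE, but FALSE OR NULL is NULL.
    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
            return Boolean.TRUE;
        }
        if (a == null || b == null) {
            return null;
        }
        return Boolean.FALSE;
    }

    // NOT NULL stays NULL.
    static Boolean not(Boolean a) {
        if (a == null) {
            return null;
        }
        return !a;
    }
}
```

The practical consequence is that a filter whose condition evaluates to NULL is neither satisfied nor negated into a match, which is where behavior can differ from a 2-valued system.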

First Major Release - Expressions, Storage, Async queries, no more JSON!

02 Oct 18:40

Bullet 1.0 is here! This release marks the first major version of Bullet.

The two high level changes that are happening with this release are:

  1. No more JSON queries. Until now, queries were mainly parsed from JSON and were sent from the web service to the backend as JSON strings, where they would be de-serialized, configured, initialized, and validated. In this release, we move from using JSON serialization to constructing and sending query objects directly. (BQL will be the main method of query construction going forward.)

  2. Expressions. Expressions enable first-order logic in virtually all parts of the Bullet query (besides Aggregations which still use field names). This allows us to filter on and project - and consequently aggregate on - more than just individual fields.

PubSub

  1. PubSubMessage
    1.1. Content changed to byte[] from String; added getContentAsString() which replaces the old getContent()
    1.2. Sequence number removed since it was not used
  2. Metadata
    2.1. Creation timestamp added. This is used in Querier as the query start time.
    2.2. Metadata copy() added. This is expected to be overridden by other PubSubs. e.g. in the subclass RESTMetadata, copy() returns a RESTMetadata.
  3. Publisher
    3.1. PubSubMessage send(PubSubMessage) and PubSubMessage send(String, String) now return the sent PubSubMessage for any configured StorageManagers, since the message may be modified. (Previously they returned void.)

New Interfaces

  1. PubSubResponder abstract class added and extended by any class that responds to a PubSubMessage.
    1.1. PubSubResponder is used in Bullet Service 1.0.0 both synchronously and asynchronously.
    1.2. BulletPubSubResponder implementation provided which publishes results to a configured PubSub
  2. Added StorageManager abstract class which is used to store and retrieve PubSubMessages. Primarily used in Bullet Service for persistent queries when sending queries and receiving results (reference implementations landing soon!).
    2.1. MemoryStorageManager implementation that stores objects in memory
    2.2. NullStorageManager implementation that stores nothing (in the event you do not want to use a StorageManager)

Metrics (New)

  1. Added MetricEvent which is a simple wrapper that represents a metric event.
  2. Added MetricCollector which is a utility class for storing frequency and average metrics for string keys.
  3. Added MetricPublisher abstract class
    3.1. MetricEventPublisher abstract class that extends MetricPublisher to publish MetricEvents.
    3.2. HTTPMetricEventPublisher implementation of MetricEventPublisher which publishes MetricEvents to a given URL and can retry multiple times.
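
As a rough picture of what "frequency and average metrics for string keys" can mean, consider the sketch below. SimpleMetricCollector and its method names are assumptions made for illustration; they are not the MetricCollector API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a collector for frequency and average metrics keyed by strings.
final class SimpleMetricCollector {
    private final Map<String, Long> counts = new HashMap<>();
    // Each entry holds {running sum, number of samples}.
    private final Map<String, double[]> averages = new HashMap<>();

    // Frequency metric: bump the counter for this key.
    void increment(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    // Average metric: fold one more sample into the running sum for this key.
    void average(String key, double value) {
        double[] acc = averages.computeIfAbsent(key, k -> new double[2]);
        acc[0] += value;
        acc[1] += 1;
    }

    long count(String key) {
        return counts.getOrDefault(key, 0L);
    }

    // Returns NaN when no sample was recorded for the key.
    double mean(String key) {
        double[] acc = averages.get(key);
        return acc == null ? Double.NaN : acc[0] / acc[1];
    }
}
```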

Expressions (New)

  1. Added Expressions and Evaluators. Expressions enable first-order logic in queries, and evaluators are constructed from expressions and evaluated on Bullet records.
    1.1. Supported expressions: Field, Value, List, Unary, Binary, NAry, and Cast
    1.2. Supported operations:
    • Arithmetic: +, -, *, /
    • Comparators: =, !=, >, <, >=, <= (with ANY/ALL for value-to-list comparisons)
    • Boolean logic: AND, OR, XOR
    • Unary: NOT, SIZEOF, IS [NOT] NULL
    • Binary: CONTAINSKEY, CONTAINSVALUE, SIZEIS
    • If-then-else: IF
    • Regex LIKE: RLIKE
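
The relationship between expressions and evaluators can be sketched as a small tree of functions over a record. Everything here is hypothetical: ExpressionSketch, its Evaluator interface and the use of a plain Map as the record are illustrative assumptions and not the bullet-core classes. The sketch also ignores NULL handling and type checking.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: each expression node is turned into a function evaluated on a record.
final class ExpressionSketch {
    // An evaluator maps a record (here a plain Map) to a value.
    interface Evaluator extends Function<Map<String, Object>, Object> { }

    // Field expression: look the name up in the record.
    static Evaluator field(String name) {
        return record -> record.get(name);
    }

    // Value expression: a constant that ignores the record.
    static Evaluator value(Object constant) {
        return record -> constant;
    }

    // Binary arithmetic expression: left + right, computed as doubles.
    static Evaluator add(Evaluator left, Evaluator right) {
        return record -> ((Number) left.apply(record)).doubleValue()
                       + ((Number) right.apply(record)).doubleValue();
    }

    // Binary comparison expression: left > right.
    static Evaluator greaterThan(Evaluator left, Evaluator right) {
        return record -> ((Number) left.apply(record)).doubleValue()
                       > ((Number) right.apply(record)).doubleValue();
    }
}
```

With this shape, a filter such as "a + b > 7" becomes greaterThan(add(field("a"), field("b")), value(7)), built once and then applied to every record.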

Query and Querier

  1. Projection and Filter now use expressions.
  2. Projection now explicitly supports three projection types: COPY, NO_COPY, and PASS_THROUGH denoting how fields should be projected.
    2.1. COPY - the record is copied before new fields are projected, e.g. in the computation post-aggregation.
    2.2. NO_COPY - fields are projected onto an empty record
    2.3. PASS_THROUGH - the original record is passed on with no projection
  3. Computation and OrderBy post-aggregations now use expressions.
  4. Added Having post-aggregation which is a filter applied after aggregation.
  5. Added Culling post-aggregation which removes specified fields. This takes the place of the implicit transient fields previously generated in Querier.
  6. Querier now takes a Query object. Consequently, there is no initialization or error-checking done in the Querier anymore.
  7. The query start time is now taken from metadata rather than set at Querier initialization; therefore, the query start time is set when the query is initially sent.
  8. Querier result metadata now includes the original query string in addition to the query object’s JSON.
  9. Queries are now error-checked in the constructors whereas previously, queries were initialized and validated after construction.
  10. Added Aggregation subclasses for the different types of aggregations.
  11. Renamed previous “parsing” package to “query” package since we no longer parse queries from JSON.
  12. Aggregation operations and strategies moved to “querying” package.
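
The three projection types in item 2 above differ only in which record the projected fields are written onto. A minimal sketch, assuming a plain Map as the record and a hypothetical ProjectionSketch.project helper (neither is the bullet-core Projection class):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the COPY, NO_COPY and PASS_THROUGH behaviors.
final class ProjectionSketch {
    enum Type { COPY, NO_COPY, PASS_THROUGH }

    static Map<String, Object> project(Map<String, Object> record, Type type,
                                       Map<String, Function<Map<String, Object>, Object>> fields) {
        if (type == Type.PASS_THROUGH) {
            // The original record is handed on untouched; no fields are projected.
            return record;
        }
        Map<String, Object> target = new HashMap<>();
        if (type == Type.COPY) {
            // Start from a copy of the original record before adding new fields.
            target.putAll(record);
        }
        // For NO_COPY the target starts empty, so only projected fields survive.
        fields.forEach((name, evaluator) -> target.put(name, evaluator.apply(record)));
        return target;
    }
}
```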

Miscellaneous

  1. Added a constructor BulletError(String, String)
  2. BulletException is now a RuntimeException and also takes a single BulletError rather than a list of errors.
  3. Removed Initializable interface
  4. Some strategies renamed
    4.1. Raw to RawStrategy
    4.2. TopK to FrequentItemsSketchingStrategy
    4.3. CountDistinct to ThetaSketchingStrategy
    4.4. GroupBy to TupleSketchingStrategy
  5. ThetaSketch now rounds the resulting count
  6. Updated SimpleEqualityPartitioner to work with Expressions and TypedObjects
  7. Fixed a bug in SimpleEqualityPartitioner where the partitioner did not differentiate properly between missing fields and null fields.
  8. Typesystem moved to Bullet Record 1.0.0
  9. The default BulletRecordProvider class is now “com.yahoo.bullet.record.avro.TypedAvroBulletRecordProvider” from “com.yahoo.bullet.record.AvroBulletRecordProvider”
  10. Can now get a configured Bullet Record Schema from BulletConfig.

QueryManager partition cleanup

02 Feb 02:44

This release fixes a bug in the QueryManager where a partition would not be cleaned up once it was empty (that is, once all queries in it had expired). This was a leak and could eventually lead to unbounded growth of the partition key space. This could be seen in the stats, where the PARTITION_COUNT stat would just monotonically increase.
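
The class of bug described here can be illustrated with a map from partition keys to query ids. PartitionedQueries is a hypothetical sketch and not the QueryManager code; it only shows the kind of fix involved, which is dropping the partition entry when its last query is removed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: partitions map to the ids of the queries they hold.
final class PartitionedQueries {
    private final Map<String, Set<String>> partitions = new HashMap<>();

    void add(String partitionKey, String queryId) {
        partitions.computeIfAbsent(partitionKey, k -> new HashSet<>()).add(queryId);
    }

    void remove(String partitionKey, String queryId) {
        partitions.computeIfPresent(partitionKey, (key, queries) -> {
            queries.remove(queryId);
            // Returning null removes the mapping, so an empty partition does not linger.
            return queries.isEmpty() ? null : queries;
        });
    }

    // Analogous to a partition count stat: it shrinks again once partitions empty out.
    int partitionCount() {
        return partitions.size();
    }
}
```

Without the isEmpty check, the map would keep one empty set per partition key ever seen, which is the monotonic growth the stat exposed.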

QueryManager Logging Fixes

21 Dec 02:19

The logging in QueryManager was broken and did not print values for variables. This is now fixed.

Extended nested notation support for Projections

22 Nov 01:00

This updates Bullet Record to 0.3.0 and uses the forceSet to support #47 for Projections as well. Since forceSet can potentially be problematic, the Querier now handles the exception instead of crashing.

All queries can now use the full dot-separated notation to access any field in a BulletRecord (all types supported). However, operations that only work on primitives (no equals on lists, for example) are still restricted to primitives.
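
Dot-separated access can be pictured as walking a nested structure one path segment at a time. The sketch below assumes nested Maps only and uses a hypothetical NestedFieldLookup helper; it is not the BulletRecord API and leaves out list handling.

```java
import java.util.Map;

// Hypothetical sketch of resolving a dot-separated path such as "a.b.c" against nested maps.
final class NestedFieldLookup {
    // Returns the value at the path, or null if any segment is missing or is not a map.
    static Object get(Map<String, Object> record, String path) {
        Object current = record;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                // Cannot descend into a primitive or a missing value.
                return null;
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }
}
```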