strategy: market data research #55

Open
@zhehaowang

What are some general characteristics of this market (in particular the venue du)?
Given our non-trivial holding time, we should devise our strategy according to key observations we have on our data.

We have observed:

Very few (model, size) pairs trade regularly

Most pairs have wide spreads, low volume and low liquidity, and missing data on one side is to be expected. Getting into such illiquid positions would be extremely risky.

The strategy report from the most complete feed run shows:

total (style_id, size) pairs 34887
total (style_id, size) pairs 8725 with data
total (style_id, size) pairs 8725 with fresh data
total (style_id, size) pairs 4753 with fresh transactions
total (style_id, size) pairs 1820 satisfying profit cutoff ratio (bid to last) of 0.01

We can consider trading about 13.6% of the scraped space (the 4753 of 34887 pairs with fresh transactions).
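As a quick check on the funnel above, the sketch below recomputes each stage as a share of the scraped space. The counts are copied from the report; the `funnel` list and the `share_of_total` helper are names made up for illustration and are not part of `strategy.py`.

```python
# Counts copied from the strategy report above.
funnel = [
    ("scraped", 34887),
    ("with data", 8725),
    ("with fresh data", 8725),
    ("with fresh transactions", 4753),
    ("satisfying profit cutoff", 1820),
]

total = funnel[0][1]


def share_of_total(count, total=total):
    """Return count as a percentage of the scraped (style_id, size) space."""
    return 100.0 * count / total


for label, count in funnel:
    print(f"{label:<26} {count:>6} {share_of_total(count):6.1f}%")
```

The 13.6% figure corresponds to the "fresh transactions" stage; the share that also satisfies the profit cutoff is smaller still.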

High volatility and price spikes are to be expected, even in the most heavily traded pairs

If we sort the data files by size (a rough proxy for number of transactions; we have too many files for a plain ls -Sl to work):

find . -name "*.json" -exec ls -l {} + | tr -s ' ' | cut -d' ' -f 5,9 | sort -s -n -k 1,1 | tail
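For anyone who prefers to do this from Python, here is a minimal sketch of the same idea: walk the tree, collect the size of every `*.json` file, and keep the largest few. The function name `largest_json_files` is hypothetical and nothing like it exists in the repo as far as this issue shows.

```python
from pathlib import Path


def largest_json_files(root=".", count=10):
    """Return (size_in_bytes, path) for the `count` largest *.json files
    under `root`, ordered smallest to largest like `sort -n | tail`."""
    if count <= 0:
        return []
    sized = [
        (p.stat().st_size, str(p))
        for p in Path(root).rglob("*.json")
        if p.is_file()
    ]
    # Sort on size only; Python's sort is stable, like `sort -s`.
    sized.sort(key=lambda item: item[0])
    return sized[-count:]


if __name__ == "__main__":
    for size, path in largest_json_files():
        print(size, path)
```

File size is only a proxy for transaction count, so the result is a shortlist to inspect rather than a liquidity ranking.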

And look at some of our most liquid pairs:

./du_analyzer.py --style_id 554723-051 --size 7.0 --mode plot

[Figure_1: plot output for style_id 554723-051, size 7.0]

Around 20191014, the current strategy would reach a drastically different decision than at any other time.
I cannot yet come up with an explanation for the spikes.

Another, more extreme example (a 2017 Valentine's Day issue):

./du_analyzer.py --style_id 881426-009 --size 7.0 --mode plot

[Figure_1: plot output for style_id 881426-009, size 7.0]

./du_analyzer.py --style_id 881426-009 --size 7.0 --mode stats
        First Date:       2019-07-31T06:49:14.439100
        Last Date:        2019-12-24T06:48:50.548423
        Number of Sales:  160
        Sales / Day:      1.10
        High:             2699.00 CNY 385.80 USD
        Low:              1819.00 CNY 260.01 USD
        First:            1859.00 CNY 265.73 USD
        Last:             2109.00 CNY 301.46 USD
        Average:          2167.75 CNY 309.86 USD
        Stdev:            178.92

This indicates that filtering and sorting by mid-to-last can be quite misleading.

We suspect our strategy is inherently biased towards riskier new releases
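To put numbers on this, the sketch below takes the figures from the stats output for 881426-009 and expresses one standard deviation as a fraction of the last sale price, next to the 0.01 cutoff ratio quoted in the strategy report. Comparing the two directly is an assumption made for illustration: the report applies the cutoff to bid-to-last, and how `strategy.py` computes mid-to-last is not shown in this issue.

```python
from datetime import datetime

# Values copied from the --mode stats output above (prices in CNY).
first_date = datetime.fromisoformat("2019-07-31T06:49:14.439100")
last_date = datetime.fromisoformat("2019-12-24T06:48:50.548423")
num_sales = 160
last_price = 2109.00
average_price = 2167.75
stdev = 178.92

# Cutoff ratio quoted in the strategy report (applied there to bid-to-last).
cutoff_ratio = 0.01

# Reproduce the Sales / Day line as a sanity check on the inputs.
days = (last_date - first_date).total_seconds() / 86400
sales_per_day = num_sales / days

# One standard deviation expressed as a fraction of the last sale price.
relative_stdev = stdev / last_price

# Distance of the last sale from the period average, as a fraction of last.
last_vs_average = (average_price - last_price) / last_price

print(f"sales/day:            {sales_per_day:.2f}")
print(f"stdev / last:         {relative_stdev:.4f}")
print(f"(average-last)/last:  {last_vs_average:.4f}")
print(f"cutoff ratio:         {cutoff_ratio:.4f}")
```

If typical price noise is several times larger than the cutoff, a single "last" print can move a pair in or out of the candidate list on its own.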

./strategy.py --start_from ../feed/merged.20191225.csv | grep "Release date" | tr -s ' ' | cut -d' ' -f 3 | sort
2008-11-28
2017-01-28
2017-06-10
2017-08-05
2017-10-07
2017-10-07
2017-11-21
2018-09-05
2019-01-22
2019-06-10
2019-08-24
2019-10-25
2019-11-07
2019-11-30
2019-12-06
2019-12-06
2019-12-07
2019-12-07
2019-12-07

As a rough estimate, with the default cut-off ratios, more than half of the release dates listed (11 of 19) fall in this year, and those skew towards issues that were just released and started trading only a few days ago.
Our belief is that new issues generally "stabilize" at a price lower than the trading price of the first few days, and that this stabilization period is shorter than our holding time, which makes capturing the difference on new issues tricky.
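The tally behind that estimate can be reproduced with a short sketch. The dates are copied from the output above and the feed date is taken from the `merged.20191225.csv` filename; the 30-day window used for "recent" is an arbitrary threshold picked for illustration.

```python
from collections import Counter
from datetime import date

# Release dates copied from the strategy.py output above.
release_dates = """
2008-11-28 2017-01-28 2017-06-10 2017-08-05 2017-10-07 2017-10-07
2017-11-21 2018-09-05 2019-01-22 2019-06-10 2019-08-24 2019-10-25
2019-11-07 2019-11-30 2019-12-06 2019-12-06 2019-12-07 2019-12-07
2019-12-07
""".split()

feed_date = date(2019, 12, 25)  # from the merged.20191225.csv filename
recent_window_days = 30         # assumption: arbitrary cut-off for "recent"

parsed = [date.fromisoformat(d) for d in release_dates]
by_year = Counter(d.year for d in parsed)
recent = [d for d in parsed if (feed_date - d).days <= recent_window_days]

for year in sorted(by_year):
    print(year, by_year[year])
print(f"released within {recent_window_days} days of the feed: "
      f"{len(recent)} of {len(parsed)}")
```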

I'm bearish on automated bids because of the volatility.
Backtesting (sim) would require non-trivial implementation effort, and given the data scarcity I doubt we could derive a meaningful conclusion from it.
Continuing to gather data wouldn't hurt, but I tend to think automating bids is too risky: we aren't disciplined enough and don't have enough data.
