Concurrency plans? #46
-
Can you tell me more about your use case? You will be fetching data from a remote API, storing it on your client with TinyFlux, and performing rolling calculations on the client as well? How often are you fetching from the remote API? If you aren't fetching at least once per second, it should not be a problem to perform all write and read operations synchronously with the TinyFlux API as it is.
-
OK, I think I know what you are doing. I'm not familiar with New Relic, but it looks like they have "intelligent" alert monitoring that should essentially give you alert thresholds that are dynamic in nature: https://newrelic.com/platform/alerts
Does this address your use case?
As for TinyFlux, you can use it in the manner you described. If you are fetching from New Relic, awaiting all requests, calculating thresholds, and then caching data every 5 minutes, you should not need to write to TinyFlux in an async manner. You are awaiting all entities first, right? Then you would just be writing one data point to TinyFlux: a Point with a key/value pair for each entity and a timestamp for that 5-minute interval.
How many entities per interval? Thousands?
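For concreteness, a minimal sketch of that single-Point-per-interval write, using TinyFlux's documented Point and insert API; the entity names and values here are illustrative placeholders:

```python
# One Point per 5-minute interval: one field per entity.
# Field names/values are illustrative; TinyFlux persists to a CSV file.
from datetime import datetime, timezone
from tinyflux import TinyFlux, Point

db = TinyFlux("thresholds.csv")
db.insert(
    Point(
        time=datetime(2023, 12, 29, 22, 20, tzinfo=timezone.utc),
        fields={"entity_a": 0.87, "entity_b": 1.23},  # one key/value pair per entity
    )
)
```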
On Fri, Dec 29, 2023 at 10:23 PM, devinnasar wrote:
I'm using the New Relic GraphQL API (Nerdgraph). At the 10,000-foot level, I'm estimating what alert thresholds should be for each of several thousand entities, based on historical time series data for each entity. The idea is to procedurally generate sane thresholds based on the performance of the actual infrastructure: if we were quiet for the last 3 months, thresholds should be low, but if we had an incident, our thresholds should increase. I'm not ready to graduate to provisioning a full TSDB right now. The alert threshold file artifacts I'm generating are used in Terraform that provisions Alert Conditions, and the cost of integrating something like full Influx is not affordable at present.
- For a given New Relic account, locate all entity types among entities possessing specific tags (entities which belong to an identified software component).
- For each entity type, run up to 5 golden metric queries.
- For each golden metric query:
  - store the time series data,
  - create a pandas DataFrame with all new and previously acquired time series data,
  - calculate a critical and a warning threshold for the golden metric for that entity type.
- Write all golden metric thresholds for the entity type to a file.
- The result, for example, tells us what our thresholds should be for all lambdas belonging to component X.
- Doing this in a reasonable time frame involves asynchronously querying Nerdgraph for the entity data in parallel. Nerdgraph also has several situations where it asks you to 'call it back later' for long-running queries that run server side; asynchronously checking for these queries to be finished is a requirement. I'm using asyncio for this.
- When I retrieve time series data for an entity, I need to store it and move on. Each of my coroutines needs to be able to do this in a non-blocking way.
- The goal is to gather the data for all entities in parallel before performing the alert threshold estimation with pandas.
- Once estimation has been run on the time series, a file is written for each entity containing the calculated thresholds. Ideally this would be done asynchronously as well.
- I'd like my process to run at least once per 24 hours, collecting data for all several thousand entities and adding time series data to the previously collected data. The idea is that as entities manifest in New Relic, we start collecting data for them and create a rolling estimate of the alert threshold. We need to run the process frequently because New Relic calculates aggregate functions over a limited number of 'buckets' in the query period. Querying over too long a time range (for example, a week) will cause the alert threshold estimation to have very poor resolution (we want to stay within 5-minute windows).
- We will also be dropping time series data older than 3 months.
Currently, in other async projects, I'm using aiotinydb so that all of my coroutines can access the data store without blocking each other. The estimation process I've described doesn't attempt to cache data; it asks Nerdgraph for data across a window of 3 months into the past, which is skewing my data. I'm looking at TinyFlux as a way to go from querying a large window infrequently with no caching to querying a small window routinely and caching, as a way to grow the time series to the required 3-month period.
Thanks for taking the time to read this. Any advice you might have would be appreciated.
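A minimal sketch of that cache-and-prune cycle, assuming TinyFlux's documented insert_multiple/TimeQuery/remove API; the tag and field names are illustrative:

```python
# Each daily run: insert the newly fetched window, then drop points
# older than 3 months. The fetch itself is a placeholder.
from datetime import datetime, timedelta, timezone
from tinyflux import TinyFlux, Point, TimeQuery

db = TinyFlux("golden_metrics.csv")

def cache_run(new_points: list) -> None:
    db.insert_multiple(new_points)                  # add today's small window
    cutoff = datetime.now(timezone.utc) - timedelta(days=90)
    db.remove(TimeQuery() < cutoff)                 # roll the 3-month window

cache_run([
    Point(
        time=datetime.now(timezone.utc),
        tags={"entity_type": "lambda", "metric": "duration"},
        fields={"value": 123.4},
    )
])
```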
-
You could use a worker thread with a shared queue to do this, I believe: each of your async queries fetches the data and then writes to a shared Python Queue, and your worker thread batches the writes to TinyFlux:
https://github.com/citrusvanilla/tinyflux/blob/master/examples/3_iot_datastore_with_mqtt.py
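A minimal sketch of that pattern (distinct from the linked MQTT example); the queue, batch size, and fetch coroutine are illustrative placeholders:

```python
# Coroutines enqueue Points; a single worker thread owns the TinyFlux
# handle and batches the writes.
import asyncio
import queue
import threading
from datetime import datetime, timezone

from tinyflux import TinyFlux, Point

write_queue: "queue.Queue" = queue.Queue()
STOP = object()  # sentinel telling the worker to shut down

def writer_worker(db_path: str, batch_size: int = 500) -> None:
    """Drain the shared queue and batch writes into TinyFlux."""
    db = TinyFlux(db_path)  # created and used only in this thread
    batch = []
    while True:
        item = write_queue.get()
        if item is STOP:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            db.insert_multiple(batch)
            batch.clear()
    if batch:  # flush whatever is left on shutdown
        db.insert_multiple(batch)

async def fetch_and_enqueue(entity_id: str) -> None:
    """Placeholder for an async Nerdgraph fetch; put() never blocks the loop."""
    value = 42.0  # stand-in for a fetched metric value
    write_queue.put(Point(
        time=datetime.now(timezone.utc),
        tags={"entity": entity_id},
        fields={"value": value},
    ))

async def main() -> None:
    worker = threading.Thread(target=writer_worker, args=("metrics.csv",))
    worker.start()
    await asyncio.gather(*(fetch_and_enqueue(f"entity-{i}") for i in range(10)))
    write_queue.put(STOP)
    worker.join()

asyncio.run(main())
```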
Though if you really have 1000 * 5 * 300 = 1,500,000 data points every 5 minutes, and each key/value pair is 20 bytes, you're looking at 30 MB every 5 minutes. Multiply that by 12 for each hour, then by 24 hours, then by 90 days, and you have about 750 GB of data. That is entirely too much for TinyFlux. Even if you do this once per day, it's 30 MB * 90 days = 2.7 GB, which is at the very limit of what TinyFlux is capable of.
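Spelled out as a quick back-of-the-envelope check, using the numbers above:

```python
# Sizing estimate from the comment above; all inputs are its assumptions.
points = 1000 * 5 * 300            # entities * queries * points per batch
mb_per_batch = points * 20 / 1e6   # 20 bytes per key/value pair -> ~30 MB
print(mb_per_batch * 12 * 24 * 90 / 1e3)  # every 5 min for 90 days: ~778 GB
print(mb_per_batch * 90 / 1e3)            # once per day for 90 days: ~2.7 GB
```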
On Sat, Dec 30, 2023 at 3:46 PM, devinnasar wrote:
To answer your last question: it's thousands of entities * 4 or 5 queries * 344 data points per query, on each run of the program. The goal is to run the program once per day to get a resolution of 5-minute aggregation windows (25 hrs / 344 windows). Each data point needs to be cached in TinyFlux.
-
Okay, well, whatever you're up to, I'm not here to architect your pipeline for you. I won't get around to making TinyFlux writes async under the hood anytime soon, but you can make writes async yourself with a wrapper. I'll add this as a future feature.
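A minimal sketch of such a wrapper, assuming a single process owns the database file; `insert_async` and the lock are illustrative, not part of TinyFlux:

```python
# Offload TinyFlux's synchronous insert to a thread so coroutines
# don't block the event loop. insert_async is a made-up helper name.
import asyncio
from datetime import datetime, timezone

from tinyflux import TinyFlux, Point

db = TinyFlux("cache.csv")
_write_lock = asyncio.Lock()  # serialize coroutines sharing one handle

async def insert_async(point: Point) -> None:
    async with _write_lock:
        await asyncio.to_thread(db.insert, point)

async def main() -> None:
    await asyncio.gather(*(
        insert_async(Point(
            time=datetime.now(timezone.utc),
            tags={"entity": f"entity-{i}"},
            fields={"value": float(i)},
        ))
        for i in range(5)
    ))

asyncio.run(main())
```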
On Sun, Dec 31, 2023 at 12:24 AM, devinnasar wrote:
You're mistaken about the 'every 5 minutes' part. I'm not saying I'm running this process every 5 minutes; I'm saying I'm running it once per 24 hours. The maximum number of aggregation windows in a New Relic query is 344, meaning that if I query a time range starting at 00:00 and ending at 23:59, that gives me 344 windows of roughly 4.1 minutes each, rounded up to windows of 5 minutes. That means I'll have 344 data points coming in for each entity type and golden metric every 24 hours.
-
Hello,
I'm curious whether the project will add concurrency features similar to this project: https://github.com/aiotinydb/aiotinydb. I'm interested in using TinyFlux for caching large amounts of time series data from a remote API and performing rolling calculations over that data. My original plan was to use aiotinydb, but the article explaining how large databases can increase write times concerned me.
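For context, the read path I have in mind, sketched against TinyFlux's documented search/TimeQuery API; the tag and field names are illustrative:

```python
# Pull cached points into pandas and compute a rolling statistic.
# Assumes the cache already holds points with a "value" field.
from datetime import datetime, timedelta, timezone

import pandas as pd
from tinyflux import TinyFlux, TimeQuery

db = TinyFlux("cache.csv")
cutoff = datetime.now(timezone.utc) - timedelta(days=90)
points = db.search(TimeQuery() >= cutoff)  # last 3 months of cached data

df = pd.DataFrame(
    [{"time": p.time, "value": p.fields["value"]} for p in points]
).set_index("time").sort_index()

# e.g. a rolling 7-day 95th percentile as a threshold candidate
threshold = df["value"].rolling("7D").quantile(0.95)
```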