add dataset name and type to home api response #1271


Merged · 1 commit · Mar 26, 2025
53 changes: 42 additions & 11 deletions src/prism/home/mod.rs
@@ -29,6 +29,7 @@ use tracing::error;
use crate::{
alerts::{get_alerts_info, AlertError, AlertsInfo, ALERTS},
correlation::{CorrelationError, CORRELATIONS},
event::format::LogSource,
handlers::http::{
cluster::fetch_daily_stats_from_ingestors,
logstream::{error::StreamError, get_stats_date},
@@ -61,6 +62,19 @@ struct TitleAndId {
id: String,
}

#[derive(Debug, Serialize)]
enum DataSetType {
Logs,
Metrics,
Traces,
}

#[derive(Debug, Serialize)]
struct DataSet {
title: String,
dataset_type: DataSetType,
}

#[derive(Debug, Serialize)]
pub struct HomeResponse {
alert_titles: Vec<TitleAndId>,
@@ -69,7 +83,7 @@ pub struct HomeResponse {
stream_info: StreamInfo,
stats_details: Vec<DatedStats>,
stream_titles: Vec<String>,

datasets: Vec<DataSet>,
dashboard_titles: Vec<TitleAndId>,
filter_titles: Vec<TitleAndId>,
}
@@ -161,25 +175,23 @@ pub async fn generate_home_response(key: &SessionKey) -> Result<HomeResponse, Pr
dates.reverse();

let mut stream_details = Vec::new();

let mut datasets = Vec::new();
// this will hold the summary of all streams for the last 7 days
let mut summary = StreamInfo::default();

let mut stream_wise_ingestor_stream_json = HashMap::new();
let mut stream_wise_stream_json = HashMap::new();
for stream in stream_titles.clone() {
let path = RelativePathBuf::from_iter([&stream, STREAM_ROOT_DIRECTORY]);
let obs = PARSEABLE
.storage
.get_object_store()
.get_objects(
Some(&path),
Box::new(|file_name| {
file_name.starts_with(".ingestor") && file_name.ends_with("stream.json")
}),
Box::new(|file_name| file_name.ends_with("stream.json")),
)
.await?;

let mut ingestor_stream_jsons = Vec::new();
let mut stream_jsons = Vec::new();
for ob in obs {
let stream_metadata: ObjectStoreFormat = match serde_json::from_slice(&ob) {
Ok(d) => d,
@@ -188,13 +200,31 @@ pub async fn generate_home_response(key: &SessionKey) -> Result<HomeResponse, Pr
continue;
}
};
ingestor_stream_jsons.push(stream_metadata);
stream_jsons.push(stream_metadata);
}
stream_wise_ingestor_stream_json.insert(stream, ingestor_stream_jsons);
stream_wise_stream_json.insert(stream.clone(), stream_jsons.clone());

let log_source = &stream_jsons[0].clone().log_source;

// if log_source_format is otel-metrics, set DataSetType to metrics
//if log_source_format is otel-traces, set DataSetType to traces
//else set DataSetType to logs

let dataset_type = match log_source[0].log_source_format {
LogSource::OtelMetrics => DataSetType::Metrics,
LogSource::OtelTraces => DataSetType::Traces,
_ => DataSetType::Logs,
};

let dataset = DataSet {
title: stream.clone(),
dataset_type,
};
datasets.push(dataset);
}
Comment on lines +207 to 224

coderabbitai bot (@coderabbitai) commented on Mar 26, 2025

⚠️ Potential issue

Add Missing Error Handling for Empty Collections

The code assumes stream_jsons and log_source always have at least one element, which could cause a panic if either is empty.

Add error handling to prevent the panic:

-let log_source = &stream_jsons[0].clone().log_source;
+if stream_jsons.is_empty() {
+    continue;
+}
+let log_source = &stream_jsons[0].clone().log_source;
+
+if log_source.is_empty() {
+    continue;
+}

let dataset_type = match log_source[0].log_source_format {
    LogSource::OtelMetrics => DataSetType::Metrics,
    LogSource::OtelTraces => DataSetType::Traces,
    _ => DataSetType::Logs,
};

@nikhilsinhaparseable (Contributor, Author) commented on Mar 26, 2025

If the stream is found, stream_json always exists, and for a stream_json, log_source always exists. Do we still need a check?

coderabbitai bot (@coderabbitai) replied on Mar 26, 2025

@nikhilsinhaparseable Thank you for this clarification. I understand now that these are system invariants - if a stream is found, stream_jsons will always exist with at least one element, and for any valid stream_json, the log_source array will always have at least one element as well. This explains why the additional null checks aren't necessary in this context.


✏️ Learnings added
Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.529Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
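
For reference, a panic-free variant of the classification discussed above is a small change using first() instead of indexing. The snippet below is only an illustrative sketch with simplified stand-in types (LogSourceEntry, classify, and the Json fallback variant are assumptions, not items from the Parseable codebase); it is not the merged code.

```rust
// Sketch: classify a stream's dataset type from its first log source without
// indexing into possibly-empty vectors. Per the invariant recorded above, the
// fallback branch should be unreachable in the real codebase.

#[derive(Debug, Clone, PartialEq)]
enum LogSource {
    OtelMetrics,
    OtelTraces,
    Json, // stand-in for "any other log source"
}

#[derive(Debug, Clone)]
struct LogSourceEntry {
    log_source_format: LogSource,
}

#[derive(Debug, PartialEq)]
enum DataSetType {
    Logs,
    Metrics,
    Traces,
}

// Map the first recorded log source to a dataset type; default to Logs when
// the slice is empty instead of panicking on `log_source[0]`.
fn classify(log_source: &[LogSourceEntry]) -> DataSetType {
    match log_source.first().map(|entry| &entry.log_source_format) {
        Some(LogSource::OtelMetrics) => DataSetType::Metrics,
        Some(LogSource::OtelTraces) => DataSetType::Traces,
        _ => DataSetType::Logs,
    }
}

fn main() {
    let entries = vec![LogSourceEntry {
        log_source_format: LogSource::OtelTraces,
    }];
    assert_eq!(classify(&entries), DataSetType::Traces);
    // An empty slice degrades to Logs rather than panicking.
    assert_eq!(classify(&[]), DataSetType::Logs);
}
```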


for date in dates.into_iter() {
let dated_stats = stats_for_date(date, stream_wise_ingestor_stream_json.clone()).await?;
let dated_stats = stats_for_date(date, stream_wise_stream_json.clone()).await?;
summary.stats_summary.events += dated_stats.events;
summary.stats_summary.ingestion += dated_stats.ingestion_size;
summary.stats_summary.storage += dated_stats.storage_size;
@@ -205,7 +235,8 @@ pub async fn generate_home_response(key: &SessionKey) -> Result<HomeResponse, Pr
Ok(HomeResponse {
stream_info: summary,
stats_details: stream_details,
stream_titles: stream_titles.clone(),
stream_titles,
datasets,
alert_titles,
correlation_titles,
dashboard_titles,
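
Since DataSet and DataSetType derive Serialize with no rename attributes in the diff, a datasets entry should serialize with the field names as written and the enum variant as a plain string. The snippet below sketches that expectation with serde_json; the stream title is a made-up example, and the final response shape ultimately depends on how HomeResponse is serialized by the handler.

```rust
// Sketch only: reproduces the two new types from the diff with their default
// serde derives to show the approximate JSON shape of one `datasets` entry.
// Requires the `serde` (with the "derive" feature) and `serde_json` crates.

use serde::Serialize;

#[derive(Debug, Serialize)]
enum DataSetType {
    Logs,
    Metrics,
    Traces,
}

#[derive(Debug, Serialize)]
struct DataSet {
    title: String,
    dataset_type: DataSetType,
}

fn main() {
    let dataset = DataSet {
        title: "otel-traces-demo".to_string(), // hypothetical stream name
        dataset_type: DataSetType::Traces,
    };
    // Prints: {"title":"otel-traces-demo","dataset_type":"Traces"}
    println!("{}", serde_json::to_string(&dataset).unwrap());
}
```

If the console expects lowercase type values instead, a #[serde(rename_all = "lowercase")] attribute on DataSetType would be the usual adjustment.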