Data truncation occurs in TCP input -> File output pipeline, but not in File input/output #115

@datasc31

Description:
When using WarpParse to parse a specific class of firewall activity logs, data truncation occurs in the TCP input -> File output pipeline. The same log data does NOT exhibit truncation issues when using File input -> File output.

This indicates a potential issue in the TCP ingestion or stream handling logic, rather than in the parser itself or file-based I/O.

Environment:

  • Engine: WarpParse
  • Input (problematic): TCP
  • Output: File
  • Input (non-problematic): File
  • Output: File
  • Log format: Plain text, single-line, pipe-delimited key-value pairs
  • Transport: TCP (sender pushes raw log lines)
  • WarpParse version: 0.15.1

Reproduction Steps:

  1. Configure WarpParse with:
    • TCP as input
    • File as output
  2. Send the following log line via TCP (one event per line):
# raw message
   2018 01 30 13:12:21 Security +01:00 Block: type=FWD|action=BLOCK|proto=UDP|ipVersion=4|srcIF=eth0|srcZone=LAN|srcIP=10.17.34.12|srcPort=54915|srcMAC=18:db:f2:13:ca:9c|srcNAT=0.0.0.0|srcCountry=CN|srcASN=4134|dstIF=eth1|dstZone=WAN|dstIP=10.17.34.255|dstPort=54915|dstService=custom_udp|dstNAT=0.0.0.0|dstCountry=US|dstASN=15169|rule=BLOCKALL|ruleType=FirewallPolicy|policyId=policy_10234|policyGroup=corp_default|interfaceGroup=internal_to_external|routingTable=main|connState=NEW|sessionId=843920184|sessionDuration=0|receivedPackets=0|sentPackets=0|receivedBytes=0|sentBytes=0|packetCount=1|byteCount=60|tcpFlags=|icmpType=|icmpCode=|application=unknown|applicationRisk=0|applicationCategory=unclassified|protocolDetection=enabled|detectedProtocol=udp_generic|user=john.doe|userGroup=employees|authMethod=ldap|vpnType=none|vpnTunnelId=0|contentProfile=default|contentAction=none|url=http://example.com/resource|urlCategory=uncategorized|threatLevel=0|threatName=none|signatureId=0|signatureVersion=0|limitProfile=default|rateLimitPps=1200|rateLimitBps=768000|logSource=box_firewall_activity|logType=activity|logVersion=1
  3. Observe the output file generated by WarpParse.
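
For step 2, a minimal sender sketch in Python may help others reproduce the issue. It is not part of WarpParse; the function name send_lines and the host, port, and file name in the usage comment are placeholders for whatever the TCP input is actually configured with. It writes newline-terminated lines over a single TCP connection, which is what "sender pushes raw log lines" is assumed to mean here.

```python
import socket


def send_lines(host: str, port: int, lines: list[str]) -> int:
    """Send each line, newline-terminated, over one TCP connection.

    Returns the number of bytes written so the caller can compare it
    with the size of the output file.
    """
    # Normalise line endings: exactly one "\n" per event.
    payload = "".join(line.rstrip("\r\n") + "\n" for line in lines).encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)  # sendall loops until every byte is written
    return len(payload)


# Usage (hypothetical address and file name):
#   with open("raw_message.log", encoding="utf-8") as f:
#       send_lines("127.0.0.1", 9000, f.read().splitlines())
```

Sending the same line several times in one call makes it easier to see whether the cut always falls at the same byte offset.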

Expected Behavior:

  • The entire log line should be preserved and parsed correctly.
  • Output content should be identical (or structurally equivalent after parsing) to the input log line.

Actual Behavior:

  • The output log is truncated.
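  • Part of the log line (typically near the end) is missing. In the evidence below, the first record lacks its leading "2018 ", and the second record is cut after "url=http://example." with the remainder appearing as a separate record.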
  • The truncation only occurs when using TCP input.
  • The issue does NOT occur when:
    • File input -> File output
    • File input -> parsing -> File output

Evidence:

  • Full original log: provided above
  • Truncated output log:
# base64
MDEgMzAgMTM6MTI6MjEgU2VjdXJpdHkgKzAxOjAwIEJsb2NrOiB0eXBlPUZXRHxhY3Rpb249QkxPQ0t8cHJvdG89VURQfGlwVmVyc2lvbj00fHNyY0lGPWV0aDB8c3JjWm9uZT1MQU58c3JjSVA9MTAuMTcuMzQuMTJ8c3JjUG9ydD01NDkxNXxzcmNNQUM9MTg6ZGI6ZjI6MTM6Y2E6OWN8c3JjTkFUPTAuMC4wLjB8c3JjQ291bnRyeT1DTnxzcmNBU049NDEzNHxkc3RJRj1ldGgxfGRzdFpvbmU9V0FOfGRzdElQPTEwLjE3LjM0LjI1NXxkc3RQb3J0PTU0OTE1fGRzdFNlcnZpY2U9Y3VzdG9tX3VkcHxkc3ROQVQ9MC4wLjAuMHxkc3RDb3VudHJ5PVVTfGRzdEFTTj0xNTE2OXxydWxlPUJMT0NLQUxMfHJ1bGVUeXBlPUZpcmV3YWxsUG9saWN5fHBvbGljeUlkPXBvbGljeV8xMDIzNHxwb2xpY3lHcm91cD1jb3JwX2RlZmF1bHR8aW50ZXJmYWNlR3JvdXA9aW50ZXJuYWxfdG9fZXh0ZXJuYWx8cm91dGluZ1RhYmxlPW1haW58Y29ublN0YXRlPU5FV3xzZXNzaW9uSWQ9ODQzOTIwMTg0fHNlc3Npb25EdXJhdGlvbj0wfHJlY2VpdmVkUGFja2V0cz0wfHNlbnRQYWNrZXRzPTB8cmVjZWl2ZWRCeXRlcz0wfHNlbnRCeXRlcz0wfHBhY2tldENvdW50PTF8Ynl0ZUNvdW50PTYwfHRjcEZsYWdzPXxpY21wVHlwZT18aWNtcENvZGU9fGFwcGxpY2F0aW9uPXVua25vd258YXBwbGljYXRpb25SaXNrPTB8YXBwbGljYXRpb25DYXRlZ29yeT11bmNsYXNzaWZpZWR8cHJvdG9jb2xEZXRlY3Rpb249ZW5hYmxlZHxkZXRlY3RlZFByb3RvY29sPXVkcF9nZW5lcmljfHVzZXI9am9obi5kb2V8dXNlckdyb3VwPWVtcGxveWVlc3xhdXRoTWV0aG9kPWxkYXB8dnBuVHlwZT1ub25lfHZwblR1bm5lbElkPTB8Y29udGVudFByb2ZpbGU9ZGVmYXVsdHxjb250ZW50QWN0aW9uPW5vbmV8dXJsPWh0dHA6Ly9leGFtcGxlLmNvbS9yZXNvdXJjZXx1cmxDYXRlZ29yeT11bmNhdGVnb3JpemVkfHRocmVhdExldmVsPTB8dGhyZWF0TmFtZT1ub25lfHNpZ25hdHVyZUlkPTB8c2lnbmF0dXJlVmVyc2lvbj0wfGxpbWl0UHJvZmlsZT1kZWZhdWx0fHJhdGVMaW1pdFBwcz0xMjAwfHJhdGVMaW1pdEJwcz03NjgwMDB8bG9nU291cmNlPWJveF9maXJld2FsbF9hY3Rpdml0eXxsb2dUeXBlPWFjdGl2aXR5fGxvZ1ZlcnNpb249MQoyMDE4IDAxIDMwIDEzOjEyOjIxIFNlY3VyaXR5ICswMTowMCBCbG9jazogdHlwZT1GV0R8YWN0aW9uPUJMT0NLfHByb3RvPVVEUHxpcFZlcnNpb249NHxzcmNJRj1ldGgwfHNyY1pvbmU9TEFOfHNyY0lQPTEwLjE3LjM0LjEyfHNyY1BvcnQ9NTQ5MTV8c3JjTUFDPTE4OmRiOmYyOjEzOmNhOjljfHNyY05BVD0wLjAuMC4wfHNyY0NvdW50cnk9Q058c3JjQVNOPTQxMzR8ZHN0SUY9ZXRoMXxkc3Rab25lPVdBTnxkc3RJUD0xMC4xNy4zNC4yNTV8ZHN0UG9ydD01NDkxNXxkc3RTZXJ2aWNlPWN1c3RvbV91ZHB8ZHN0TkFUPTAuMC4wLjB8ZHN0Q291bnRyeT1VU3xkc3RBU049MTUxNjl8cnVsZT1CTE9DS0FMTHxydWxlVHlwZT1GaXJld2FsbFBvbGljeXxwb2xpY3lJZD1wb2xpY3lf
MTAyMzR8cG9saWN5R3JvdXA9Y29ycF9kZWZhdWx0fGludGVyZmFjZUdyb3VwPWludGVybmFsX3RvX2V4dGVybmFsfHJvdXRpbmdUYWJsZT1tYWlufGNvbm5TdGF0ZT1ORVd8c2Vzc2lvbklkPTg0MzkyMDE4NHxzZXNzaW9uRHVyYXRpb249MHxyZWNlaXZlZFBhY2tldHM9MHxzZW50UGFja2V0cz0wfHJlY2VpdmVkQnl0ZXM9MHxzZW50Qnl0ZXM9MHxwYWNrZXRDb3VudD0xfGJ5dGVDb3VudD02MHx0Y3BGbGFncz18aWNtcFR5cGU9fGljbXBDb2RlPXxhcHBsaWNhdGlvbj11bmtub3dufGFwcGxpY2F0aW9uUmlzaz0wfGFwcGxpY2F0aW9uQ2F0ZWdvcnk9dW5jbGFzc2lmaWVkfHByb3RvY29sRGV0ZWN0aW9uPWVuYWJsZWR8ZGV0ZWN0ZWRQcm90b2NvbD11ZHBfZ2VuZXJpY3x1c2VyPWpvaG4uZG9lfHVzZXJHcm91cD1lbXBsb3llZXN8YXV0aE1ldGhvZD1sZGFwfHZwblR5cGU9bm9uZXx2cG5UdW5uZWxJZD0wfGNvbnRlbnRQcm9maWxlPWRlZmF1bHR8Y29udGVudEFjdGlvbj1ub25lfHVybD1odHRwOi8vZXhhbXBsZS4=

# base64
0ZWdvcml6ZWR8dGhyZWF0TGV2ZWw9MHx0aHJlYXROYW1lPW5vbmV8c2lnbmF0dXJlSWQ9MHxzaWduYXR1cmVWZXJzaW9uPTB8bGltaXRQcm9maWxlPWRlZmF1bHR8cmF0ZUxpbWl0UHBzPTEyMDB8cmF0ZUxpbWl0QnBzPTc2ODAwMHxsb2dTb3VyY2U9Ym94X2ZpcmV3YWxsX2FjdGl2aXR5fGxvZ1R5cGU9YWN0aXZpdHl8bG9nVmVyc2lvbj0x
# base64 decode
01 30 13:12:21 Security +01:00 Block: type=FWD|action=BLOCK|proto=UDP|ipVersion=4|srcIF=eth0|srcZone=LAN|srcIP=10.17.34.12|srcPort=54915|srcMAC=18:db:f2:13:ca:9c|srcNAT=0.0.0.0|srcCountry=CN|srcASN=4134|dstIF=eth1|dstZone=WAN|dstIP=10.17.34.255|dstPort=54915|dstService=custom_udp|dstNAT=0.0.0.0|dstCountry=US|dstASN=15169|rule=BLOCKALL|ruleType=FirewallPolicy|policyId=policy_10234|policyGroup=corp_default|interfaceGroup=internal_to_external|routingTable=main|connState=NEW|sessionId=843920184|sessionDuration=0|receivedPackets=0|sentPackets=0|receivedBytes=0|sentBytes=0|packetCount=1|byteCount=60|tcpFlags=|icmpType=|icmpCode=|application=unknown|applicationRisk=0|applicationCategory=unclassified|protocolDetection=enabled|detectedProtocol=udp_generic|user=john.doe|userGroup=employees|authMethod=ldap|vpnType=none|vpnTunnelId=0|contentProfile=default|contentAction=none|url=http://example.com/resource|urlCategory=uncategorized|threatLevel=0|threatName=none|signatureId=0|signatureVersion=0|limitProfile=default|rateLimitPps=1200|rateLimitBps=768000|logSource=box_firewall_activity|logType=activity|logVersion=1
2018 01 30 13:12:21 Security +01:00 Block: type=FWD|action=BLOCK|proto=UDP|ipVersion=4|srcIF=eth0|srcZone=LAN|srcIP=10.17.34.12|srcPort=54915|srcMAC=18:db:f2:13:ca:9c|srcNAT=0.0.0.0|srcCountry=CN|srcASN=4134|dstIF=eth1|dstZone=WAN|dstIP=10.17.34.255|dstPort=54915|dstService=custom_udp|dstNAT=0.0.0.0|dstCountry=US|dstASN=15169|rule=BLOCKALL|ruleType=FirewallPolicy|policyId=policy_10234|policyGroup=corp_default|interfaceGroup=internal_to_external|routingTable=main|connState=NEW|sessionId=843920184|sessionDuration=0|receivedPackets=0|sentPackets=0|receivedBytes=0|sentBytes=0|packetCount=1|byteCount=60|tcpFlags=|icmpType=|icmpCode=|application=unknown|applicationRisk=0|applicationCategory=unclassified|protocolDetection=enabled|detectedProtocol=udp_generic|user=john.doe|userGroup=employees|authMethod=ldap|vpnType=none|vpnTunnelId=0|contentProfile=default|contentAction=none|url=http://example.

# base64 decode
com/resource|urlCategory=uncategorized|threatLevel=0|threatName=none|signatureId=0|signatureVersion=0|limitProfile=default|rateLimitPps=1200|rateLimitBps=768000|logSource=box_firewall_activity|logType=activity|logVersion=1
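
To make the evidence easier to check, the decoded records can be compared against the original line mechanically. The sketch below is only a diagnostic aid; the helper names classify_record and rejoins are made up for this report. It labels a record as complete, missing its head, or missing its tail, and checks whether two adjacent fragments concatenate back to the original. In the decoded output above, the second and third records appear to rejoin into the full line, which would point to a split rather than lost bytes.

```python
def classify_record(expected: str, record: str) -> tuple[str, int]:
    """Classify one output record against the expected full line.

    Returns (label, missing_chars), where label is one of
    "complete", "tail missing", "head missing", or "other".
    """
    if record == expected:
        return ("complete", 0)
    if record and expected.startswith(record):
        return ("tail missing", len(expected) - len(record))
    if record and expected.endswith(record):
        return ("head missing", len(expected) - len(record))
    return ("other", -1)  # -1: not a clean prefix or suffix


def rejoins(expected: str, first: str, second: str) -> bool:
    """True if two adjacent fragments concatenate back to the full line."""
    return first + second == expected
```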

Observations:

  • The issue appears to be specific to TCP-based ingestion.
  • The same parsing rules and output configuration work correctly with File input.
  • This suggests a possible problem with TCP ingestion or stream handling, as stated in the description.

Additional Notes:

  • The log line is relatively long and contains many key-value pairs.
  • Each log event is expected to be a single line.
  • No similar truncation has been observed in file-based pipelines.
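
Since TCP is a byte stream, a single read can end in the middle of a long line, and a reader that treats each read as one event would produce exactly this kind of split. The sketch below shows newline framing that buffers partial data across reads. It is not WarpParse's implementation (which has not been inspected for this report), and LineFramer is a hypothetical name used only for illustration.

```python
class LineFramer:
    """Reassemble newline-delimited events from arbitrary TCP read chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()  # bytes received but not yet terminated by "\n"

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add one read's worth of bytes; return only the completed lines."""
        self._buf.extend(chunk)
        # Everything before the last "\n" is complete; the tail stays buffered.
        *complete, rest = bytes(self._buf).split(b"\n")
        self._buf = bytearray(rest)
        return complete

    def flush(self) -> list[bytes]:
        """Call at connection close to emit a final unterminated line, if any."""
        rest, self._buf = bytes(self._buf), bytearray()
        return [rest] if rest else []
```

With this approach, a read boundary falling inside "url=http://example.com/resource" yields no event until the terminating newline arrives, so the line is emitted whole.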
# wpl
package /Firewall/{
    rule Firewall{
        (
          chars:timestamp\S,
          2*_,
          kv()| (*kv()\|),
        )
    }
}
