@nflexfo nflexfo commented Nov 26, 2025

One line description of pull request

Add a new data_type_count attribute container for data types to the storage file.

Description:

(Note that this MR is part of #5016)

The data type can be used in an event filter expression in psort.py. However, without prior knowledge of the storage file contents, there is no way to know which values can actually be used in the expression. Adding a data type counter to the storage file and printing it with pinfo lets the user look up those values.
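To illustrate the idea, here is a minimal sketch of per-data-type counting. It is not the code in this MR: the `count_data_types` helper and the dictionary-shaped events are hypothetical, and the sample values simply mirror the counts of the example run in this description.

```python
from collections import Counter


def count_data_types(events):
    """Counts events per data type and adds a 'total' entry.

    Hypothetical helper for illustration; events are plain dicts here,
    not Plaso attribute containers.
    """
    counter = Counter(event['data_type'] for event in events)
    counts = dict(counter)
    counts['total'] = sum(counter.values())
    return counts


# Hypothetical events that mirror the sample run: 14 Apache access log
# entries and 3 file stat events.
sample_events = (
    [{'data_type': 'apache:access_log:entry'}] * 14 +
    [{'data_type': 'fs:stat'}] * 3)

data_type_counts = count_data_types(sample_events)
print(data_type_counts)
```

The keys other than `total` are the values a user would want to see before writing a filter.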

With the patch, the text output format looks like:

$ ./plaso/scripts/pinfo.py 20251126T082737-apache_access.log.plaso --output_format text

************************** Plaso Storage Information ***************************
            Filename : 20251126T082737-apache_access.log.plaso
      Format version : 20230327
Serialization format : json
--------------------------------------------------------------------------------

*********************************** Sessions ***********************************
66268194-c93d-4f0d-9667-2c6caefe3d4a : 2025-11-26T07:27:37.930296+00:00
--------------------------------------------------------------------------------

******************************** Event sources *********************************
Total : 1
--------------------------------------------------------------------------------

************************ Events generated per data type ************************
         Data type name : Number of events
--------------------------------------------------------------------------------
apache:access_log:entry : 14
                fs:stat : 3
                  Total : 17
--------------------------------------------------------------------------------

************************* Events generated per parser **************************
Parser (plugin) name : Number of events
--------------------------------------------------------------------------------
       apache_access : 14
            filestat : 3
               Total : 17
--------------------------------------------------------------------------------

No events labels stored.

******************* Extraction warnings generated per parser *******************
Parser (plugin) name : Number of warnings
--------------------------------------------------------------------------------
  text/apache_access : 1
--------------------------------------------------------------------------------

************** Path specifications with most extraction warnings ***************
Number of warnings : Pathspec
--------------------------------------------------------------------------------
                 1 : type: OS, location:
                     /plaso/test_data/apache_access.log
--------------------------------------------------------------------------------

No analysis reports stored.

The markdown format:

# Plaso Storage Information

<table>
<tr><th nowrap style="text-align:left;vertical-align:top">Filename</th><td>20251126T082737-apache_access.log.plaso</td></tr>
<tr><th nowrap style="text-align:left;vertical-align:top">Format version</th><td>20230327</td></tr>
<tr><th nowrap style="text-align:left;vertical-align:top">Serialization format</th><td>json</td></tr>
</table>

## Sessions

<table>
<tr><th nowrap style="text-align:left;vertical-align:top">66268194-c93d-4f0d-9667-2c6caefe3d4a</th><td>2025-11-26T07:27:37.930296+00:00</td></tr>
</table>

## Event sources

<table>
<tr><th nowrap style="text-align:left;vertical-align:top">Total</th><td>1</td></tr>
</table>

## Events generated per data type

Data type name | Number of events
--- | ---
apache:access_log:entry | 14
fs:stat | 3
Total | 17

## Events generated per parser

Parser (plugin) name | Number of events
--- | ---
apache_access | 14
filestat | 3
Total | 17

## Event tags generated per label

N/A

### Extraction warnings generated per parser

Parser (plugin) name | Number of warnings
--- | ---
text/apache_access | 1

### Path specifications with most extraction warnings

Number of warnings | Pathspec
--- | ---
1 | type: OS, location: /plaso/test_data/apache_access.log

And the JSON format (pretty-printed with jq):

{
   ...
  "storage_counters": {
    "data_types": {
      "fs:stat": 3,
      "total": 17,
      "apache:access_log:entry": 14
    },
    "parsers": {
      "filestat": 3,
      "total": 17,
      "apache_access": 14
    },
    "event_labels": {},
    "warnings_by_parser": {
      "text/apache_access": 1
    },
    "warnings_by_path_spec": {
      "type: OS, location: /plaso/test_data/apache_access.log\n": 1
    },
    "analysis_reports": {}
  }
}
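As a rough sketch of how the JSON output could be consumed, the snippet below extracts the data type names from the `storage_counters.data_types` object shown above and turns them into candidate filter expressions. The `list_data_types` helper is hypothetical, the embedded JSON is a trimmed copy of the sample output, and the `data_type is '...'` filter form is an assumption that should be checked against the psort.py event filter documentation.

```python
import json

# Trimmed copy of the sample pinfo JSON output (only the data type counters).
PINFO_JSON = """
{
  "storage_counters": {
    "data_types": {
      "fs:stat": 3,
      "total": 17,
      "apache:access_log:entry": 14
    }
  }
}
"""


def list_data_types(pinfo_output):
    """Returns the sorted data type names, skipping the 'total' entry."""
    counters = json.loads(pinfo_output)['storage_counters']['data_types']
    return sorted(name for name in counters if name != 'total')


data_types = list_data_types(PINFO_JSON)

# Assumed filter syntax; verify against the psort.py documentation.
filters = ["data_type is '{0:s}'".format(name) for name in data_types]
for expression in filters:
    print(expression)
```

Skipping `total` matters because it shares the same object as the real data type names.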

The MR currently has no tests, but I'm willing to add them if you agree with the changes. Note that the storage file format version must be updated because the schema changed. Also, I would rather rewrite some parts of this MR if #5014 gets merged.

Thanks

Notes:

All contributions to Plaso undergo code review.
This makes sure that the code has appropriate test coverage and conforms to the
Plaso style guide.

One of the maintainers will examine your code, and may request changes. Check off the items below in
order, and then a maintainer will review your code.

Checklist:

  • No new dependencies are required or l2tdevtools has been updated.
  • Test data has a Plaso compatible license. If the test data was not authored by you (the contributor), make sure to mention its original source in ACKNOWLEDGEMENTS.
  • Reviewer assigned.
  • Automated checks (GitHub Actions, AppVeyor) pass.
