Normalization Pipeline · The Build Stage
Data Refinery
The Machine Data Insights Pipeline
Refines raw machine data into validated, CIM-compliant Splunk Technical Add-ons and Cribl packs: packaged, documented, and ready to deploy. Read-only exports go in; deployment-ready artifacts come out.
Turning Data Into Gold™
What it does
From raw export to deployable add-on.
Data Refinery is the engine MDI uses to produce accurate, timely CIM normalization artifacts. It profiles fields and sample events, classifies each sourcetype against the relevant CIM data models, and generates the props and transforms that normalize vendor fields to CIM. It then compiles, validates, and packages the result. Where an assessment finds the gaps, Data Refinery builds the fix.
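To make the "props that normalize vendor fields to CIM" idea concrete, here is a minimal sketch of how a vendor-to-CIM mapping could be rendered as a props.conf stanza using Splunk's `FIELDALIAS-<class> = <orig> AS <new>` setting. It is an illustration only, not Data Refinery's actual generator: the function name `render_props_stanza`, the sourcetype `acme:firewall`, and the vendor field names are all hypothetical.

```python
def render_props_stanza(sourcetype, field_map):
    """Render a props.conf stanza that aliases vendor fields to CIM names.

    field_map is {vendor_field: cim_field}. Hypothetical helper, shown only
    to illustrate the shape of the generated configuration.
    """
    lines = [f"[{sourcetype}]"]
    for vendor_field, cim_field in sorted(field_map.items()):
        if vendor_field == cim_field:
            continue  # field already carries its CIM name, so no alias is needed
        lines.append(f"FIELDALIAS-{cim_field} = {vendor_field} AS {cim_field}")
    return "\n".join(lines) + "\n"


# Assumed example mapping; a real one would come from profiling and the KB.
mapping = {"src_ip": "src", "dst_ip": "dest", "username": "user", "action": "action"}
stanza = render_props_stanza("acme:firewall", mapping)
print(stanza)
```

A real add-on would typically also need extractions, eval-based normalization, eventtypes, and tags; aliases are only the simplest case.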
Quality gated, not hand-finished.
Every add-on is linted for CIM compliance, scored for deployment readiness, and run through Splunk AppInspect (cloud or local) before it ships. The output is a packaged .tgz with auto-generated documentation, so what you receive is ready to deploy, not a starting point someone still has to finish.
The same pipeline also produces Cribl packs and compares add-on versions at the configuration level, so upgrades never silently break CIM coverage.
Splunk TAs tab - compile, validate, package
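One way to picture the configuration-level version comparison is a check that reports CIM fields an older add-on aliased but a newer one no longer does. The sketch below is an assumption about how such a check might look, not the product's implementation. The function names are hypothetical, and the parser is deliberately simplified: it ignores line continuations and only inspects `FIELDALIAS` settings that use `AS`.

```python
def parse_conf(text):
    """Parse Splunk-style .conf text into {stanza: {key: value}} (simplified)."""
    stanzas, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = stanzas.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return stanzas


def aliased_fields(stanza):
    """Collect the target field names of FIELDALIAS settings in one stanza."""
    targets = set()
    for key, value in stanza.items():
        if key.startswith("FIELDALIAS-"):
            tokens = value.split()  # e.g. ["src_ip", "AS", "src"]
            targets.update(
                tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok.upper() == "AS"
            )
    return targets


def dropped_fields(old_text, new_text):
    """Return {stanza: [fields aliased in the old version but not the new one]}."""
    old, new = parse_conf(old_text), parse_conf(new_text)
    report = {}
    for name, stanza in old.items():
        lost = aliased_fields(stanza) - aliased_fields(new.get(name, {}))
        if lost:
            report[name] = sorted(lost)
    return report


old_props = "[acme:firewall]\nFIELDALIAS-src = src_ip AS src\nFIELDALIAS-dest = dst_ip AS dest\n"
new_props = "[acme:firewall]\nFIELDALIAS-src = src_ip AS src\n"
print(dropped_fields(old_props, new_props))
```

A non-empty report is the kind of signal that could block an upgrade until the missing mapping is restored or the drop is confirmed as intentional.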
Inside the pipeline
What it automates.
Six stages turn read-only exports into deployment-ready, CIM-compliant artifacts.
Discover & profile
Imports the sourcetype inventory and read-only field/sample exports, then profiles every field so you know what's present before any mapping begins.
Classify to CIM
Classifies each sourcetype against the relevant CIM data models, so normalization targets the models Enterprise Security actually uses.
Map with the KB
Draws on a knowledge base of vendor-to-CIM field mappings and existing Splunkbase TAs, scoring coverage and reusing proven mappings instead of starting cold.
Compile the add-on
Generates the props, transforms, eventtypes, and tags that normalize vendor fields to CIM, and compiles them into a complete Technical Add-on.
Validate & gate
Lints every add-on for CIM compliance, scores deployment readiness, and runs Splunk AppInspect (cloud or local) before anything ships.
Package & document
Packages a deployable .tgz, auto-generates Word documentation, and produces Cribl packs, all ready to deploy rather than to finish.
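The profiling and coverage-scoring stages above can be sketched in a few lines. This is a rough illustration under stated assumptions: the function names, the 0.5 fill threshold, and the short list of expected fields are invented for the example and do not describe Data Refinery's actual scoring.

```python
from collections import Counter


def profile_fields(events):
    """Return {field: fraction of events where the field is present and non-empty}."""
    counts = Counter()
    for event in events:
        for field, value in event.items():
            if value not in (None, ""):
                counts[field] += 1
    total = len(events)
    return {field: n / total for field, n in counts.items()} if total else {}


def cim_coverage(profile, expected_fields, min_fill=0.5):
    """Share of expected fields whose fill rate meets the (assumed) threshold."""
    present = [f for f in expected_fields if profile.get(f, 0.0) >= min_fill]
    return len(present) / len(expected_fields)


# Assumed sample events, already keyed by CIM-style field names.
events = [
    {"src": "10.0.0.1", "dest": "10.0.0.2", "action": "allowed"},
    {"src": "10.0.0.3", "dest": "", "action": "blocked"},
]
# Illustrative subset of fields, not a complete data model definition.
expected = ["src", "dest", "action", "transport"]

profile = profile_fields(events)
print(profile)
print(cim_coverage(profile, expected))
```

A score like this could feed the validation gate: a sourcetype that falls below an agreed coverage level would be flagged for further mapping before packaging.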
Have a backlog of sourcetypes to normalize?
Data Refinery is how MDI delivers CIM normalization at scale - assessed, sanitized, and built into add-ons your team can deploy.