
Data Refinery

The Machine Data Insights Pipeline

Assess - CAT | Sanitize - Paydirt | Build - Data Refinery

Refines raw machine data into validated, CIM-compliant Splunk Technical Add-ons and Cribl packs: packaged, documented, and ready to deploy. Read-only exports go in; deployment-ready artifacts come out.


Turning Data Into Gold

From raw export to deployable add-on.

Data Refinery is the engine MDI uses to produce accurate, timely Common Information Model (CIM) normalization artifacts. It profiles fields and sample events, classifies each sourcetype against the relevant CIM data models, and generates the props and transforms that normalize vendor fields to CIM. It then compiles, validates, and packages the result. Where an assessment finds the gaps, Data Refinery builds the fix.

Quality gated, not hand-finished.

Every add-on is linted for CIM compliance, scored for deployment readiness, and run through Splunk AppInspect (cloud or local) before it ships. The output is a packaged .tgz with auto-generated documentation, so what you receive is ready to deploy, not a starting point someone still has to finish.

The same pipeline also produces Cribl packs and compares add-on versions at the configuration level, so an upgrade cannot silently break CIM coverage.
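To make "compares at the configuration level" concrete, here is a minimal Python sketch of one way such a check could work. It is an illustration only, not Data Refinery's implementation: `parse_conf` and `removed_settings` are hypothetical helpers, the sourcetype and field names are invented, and the parser ignores line continuations and other details of real Splunk .conf files.

```python
def parse_conf(text):
    """Parse simplified .conf text into {stanza: {setting: value}}."""
    stanzas, current = {}, "default"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            stanzas.setdefault(current, {})
        elif "=" in line:
            key, value = line.split("=", 1)
            stanzas.setdefault(current, {})[key.strip()] = value.strip()
    return stanzas


def removed_settings(old_text, new_text):
    """List (stanza, setting) pairs present in the old version but not the new."""
    old, new = parse_conf(old_text), parse_conf(new_text)
    return sorted(
        (stanza, key)
        for stanza, settings in old.items()
        for key in settings
        if key not in new.get(stanza, {})
    )


# Hypothetical props.conf content for two versions of the same add-on.
old_props = """
[vendor:firewall]
FIELDALIAS-src_ip = srcaddr AS src_ip
FIELDALIAS-dest_ip = dstaddr AS dest_ip
"""
new_props = """
[vendor:firewall]
FIELDALIAS-src_ip = srcaddr AS src_ip
"""

dropped = removed_settings(old_props, new_props)
```

In this example `dropped` names the `FIELDALIAS-dest_ip` setting, which is the kind of regression a version comparison would flag before an upgrade is deployed.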


Splunk TAs tab - compile, validate, package

What it automates.

Six stages turn read-only exports into deployment-ready, CIM-compliant artifacts.

Discover & profile

Imports the sourcetype inventory and read-only field/sample exports, then profiles every field so you know what's present before any mapping begins.
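As a rough illustration of what field profiling can mean, the Python sketch below computes how often each field is populated and how many distinct values it takes across a set of parsed sample events. The function name `profile_fields`, the two metrics, and the sample data are assumptions made for this example; the profiling Data Refinery actually performs is not described in detail here.

```python
from collections import Counter


def profile_fields(events):
    """Return coverage and distinct-value counts per field for parsed events."""
    total = len(events)
    present = Counter()
    distinct = {}
    for event in events:
        for field, value in event.items():
            if value in (None, ""):
                continue  # treat empty values as absent
            present[field] += 1
            distinct.setdefault(field, set()).add(value)
    return {
        field: {
            "coverage": present[field] / total,
            "distinct_values": len(distinct[field]),
        }
        for field in sorted(present)
    }


# Hypothetical sample events, already parsed into field/value pairs.
samples = [
    {"src_ip": "10.0.0.1", "act": "allow", "bytes": "512"},
    {"src_ip": "10.0.0.2", "act": "deny", "bytes": ""},
    {"src_ip": "10.0.0.1", "act": "allow"},
]

profile = profile_fields(samples)
```

A field such as `bytes`, populated in only one of three events, would show low coverage and is worth knowing about before it is mapped to a CIM field.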

Classify to CIM

Classifies each sourcetype against the relevant CIM data models, so normalization targets the models Enterprise Security actually uses.

Map with the KB

Draws on a knowledge base of vendor-to-CIM field mappings and existing Splunkbase TAs, scoring coverage and reusing proven mappings instead of starting cold.
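Coverage scoring can be pictured as a set comparison between the CIM fields a data model expects and the CIM fields that known mappings can supply. The Python sketch below is a simplified illustration under stated assumptions: `TARGET_FIELDS` is only a small subset of Network Traffic field names, `KB_MAPPINGS` is a hypothetical knowledge-base entry with invented vendor field names, and the scoring formula is an assumption rather than the one MDI uses.

```python
# Illustrative subset of CIM Network Traffic field names, not the full model.
TARGET_FIELDS = {"action", "src_ip", "dest_ip", "dest_port", "transport", "bytes"}

# Hypothetical knowledge-base entry: vendor field name -> CIM field name.
KB_MAPPINGS = {
    "act": "action",
    "srcaddr": "src_ip",
    "dstaddr": "dest_ip",
    "dport": "dest_port",
}


def score_coverage(vendor_fields, kb, target_fields):
    """Return (fraction of target fields covered, sorted list of missing fields)."""
    mapped = {kb[field] for field in vendor_fields if field in kb}
    covered = mapped & target_fields
    missing = target_fields - covered
    return len(covered) / len(target_fields), sorted(missing)


# Fields observed in a hypothetical sourcetype; "proto" has no known mapping.
score, missing = score_coverage(
    ["act", "srcaddr", "dstaddr", "proto"], KB_MAPPINGS, TARGET_FIELDS
)
```

Here three of the six target fields are covered by existing mappings, and the `missing` list shows which CIM fields still need a mapping.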

Compile the add-on

Generates the props, transforms, eventtypes, and tags that normalize vendor fields to CIM, and compiles them into a complete Technical Add-on.
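For readers unfamiliar with what generated props look like, the Python sketch below renders a props.conf stanza of search-time field aliases from a vendor-to-CIM mapping. It uses Splunk's `FIELDALIAS-<class> = <original> AS <new>` syntax, but the helper `render_props_stanza`, the sourcetype, and the vendor field names are hypothetical. A real add-on would typically also need extractions, evals, eventtypes, and tags, which this sketch does not cover.

```python
def render_props_stanza(sourcetype, aliases):
    """Render a props.conf stanza with one FIELDALIAS line per mapping."""
    lines = [f"[{sourcetype}]"]
    for vendor_field, cim_field in sorted(aliases.items()):
        lines.append(f"FIELDALIAS-{cim_field} = {vendor_field} AS {cim_field}")
    return "\n".join(lines) + "\n"


# Hypothetical sourcetype and vendor field names.
stanza = render_props_stanza(
    "vendor:firewall",
    {"srcaddr": "src_ip", "dstaddr": "dest_ip"},
)
print(stanza)
```

Sorting the mappings keeps the generated file stable between runs, which makes version-to-version comparison easier.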

Validate & gate

Lints every add-on for CIM compliance, scores deployment readiness, and runs Splunk AppInspect (cloud or local) before anything ships.

Package & document

Packages a deployable .tgz, auto-generates Word documentation, and produces Cribl packs, all ready to deploy rather than left for someone to finish.
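The packaging step can be sketched with Python's standard `tarfile` module: a Splunk app package is a gzip-compressed tar archive with the app's files under a single top-level directory. The function `package_addon`, the app name, and the file contents below are assumptions for illustration, and a real add-on contains more than these two files.

```python
import io
import tarfile


def package_addon(app_name, files):
    """Build an in-memory .tgz with every file under a top-level app directory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for relative_path, content in sorted(files.items()):
            data = content.encode("utf-8")
            info = tarfile.TarInfo(name=f"{app_name}/{relative_path}")
            info.size = len(data)
            info.mode = 0o644
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical add-on name and minimal file contents.
package = package_addon(
    "TA-vendor-firewall",
    {
        "default/app.conf": "[launcher]\nversion = 1.0.0\n",
        "default/props.conf": "[vendor:firewall]\nFIELDALIAS-src_ip = srcaddr AS src_ip\n",
    },
)
```

Writing the returned bytes to a file such as `TA-vendor-firewall.tgz` yields an archive that can be inspected with any tar tool before it goes to validation.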

How it's delivered: Data Refinery is MDI-operated - you receive deployment-ready Splunk TAs and Cribl packs, validated against CIM, checked by AppInspect, and fully documented. The engine is the capability; the artifacts are the deliverable.

Have a backlog of sourcetypes to normalize?

Data Refinery is how MDI delivers CIM normalization at scale - assessed, sanitized, and built into add-ons your team can deploy.
