About Me

My name is James H. Baxter – Jim – and I’m the founder of Machine Data Insights, Inc., an independent security data engineering consultancy based in Winter Springs, Florida.

I’ve spent the better part of 45 years building solutions: in the Air Force, at MCI, IBM, BP, Daou Systems, Disney, and more recently at enterprises running Splunk as their core security platform. The through-line across all of it has been automation – finding the manual, time-consuming work that nobody enjoys and building tools that eliminate it.

These days my focus is narrow by design: CIM normalization, security data integrity, and the AI-powered tooling that makes both dramatically faster and more reliable. I hold Splunk certifications as Core Consultant, Architect, and Developer, and I work directly in client environments – not through a bench of consultants or a project management layer.

The recent development I’m most proud of: CIM normalization that used to take weeks of manual work now takes a few hours to a few days – AI-driven, fully automated, with a human in the loop. That’s the outcome of building the right tools and processes and refining them across real enterprise deployments.

This blog is where I write about the work – the problems I’m solving, the tools I’m building, and the broader landscape of security data engineering, AI/ML, and Splunk. If it’s here, it came from something I actually built or learned firsthand.

WHAT I’VE BUILT

The CIM Assessment Toolkit (CAT) is a free app for assessing CIM health, field coverage, and data model acceleration status across your Splunk environment. If you’re running Enterprise Security and aren’t sure whether your data models are actually healthy, you’ll want to start there. The Splunkbase release is planned for Q2 2026.

Beyond CAT, I’ve developed the ‘Paydirt’ log scrubber (https://github.com/machinedatainsights/paydirt), which redacts sensitive data (CUI, PII, and credentials) from Splunk log exports so they can be safely imported into the MDI ‘Data Refinery’ for end-to-end automated CIM normalization.

The Data Refinery will soon support identifying just the fields you need to send to Splunk, generating Cribl packs that route log data to both long-term storage and Splunk, and performing CIM normalization in Cribl instead of relying on Splunk TAs.

Also under development or refinement are the ‘DataGen’ artificial log event generator, a refactored ‘Data Source Integrity Monitor (DSIM)’ for ML-based pipeline health monitoring, and a Performance and Capacity Analytics Splunk app that monitors and trends day-to-day resource utilization and performance to support capacity planning. These are the tools I use in client engagements.

PUBLISHED WORK

I’ve written two technical books:

Splunk 7.x Quick Start Guide – a practical reference for architecting and administering Splunk that I still use regularly myself.
https://www.amazon.com/Splunk-7-x-Quick-Start-Guide/dp/1789531098

Wireshark Essentials – covering packet analysis, protocols, and network traffic interpretation. Somewhat dated on the Wireshark version but the fundamentals hold.
https://www.amazon.com/Wireshark-Essentials-James-H-Baxter/dp/1783554630

CONNECT

LinkedIn: https://www.linkedin.com/in/jameshbaxter/
MDI Website: https://machinedatainsights.com/#contact
Email: jim.baxter@machinedatainsights.com

If you’re working through a CIM normalization problem, dealing with data quality issues in ES, or just want to talk through a Splunk architecture challenge – reach out.