Researcher & Physicist
Lucas Kotz
Physics PhD · Data Scientist · Simulation and Modeling Specialist
I earned my PhD in Physics from Southern Methodist University, where I developed a strong foundation in data-driven analysis and predictive modeling. These skills have enabled me to tackle complex, data-intensive problems in areas like statistical modeling, uncertainty quantification, high-dimensional optimization, and data-driven theoretical analysis.
Learn about the particle that holds matter together within our universe: the pion. By probing its internal structure at high energies, we learn about the universe we interact with daily.
About
Background & Experience
About Me
I grew up in Valley Stream, NY, where my early curiosity for science was nurtured through museum visits and trips to the observatories across Long Island. That curiosity evolved into a passion for physics during high school, leading me to pursue a degree in physics at the University at Buffalo (UB).
At UB, I was exposed to a broad range of fields — from cosmology to condensed matter to particle physics. I discovered my interest in theoretical particle physics through hands-on labs and independent studies. I even designed a cosmic ray detector, which gave me a deep appreciation for the precision and ambition of high-energy physics experiments. A research experience at UB introduced me to the mathematical structure underlying quantum field theory and solidified my path toward graduate study.
I went on to earn my PhD in physics from Southern Methodist University (SMU), where I focused on the internal structure of the pion, a particle that mediates the strong force between nucleons (protons and neutrons) at low energies. My research combined theoretical physics, statistical modeling, and computational tools to extract meaningful insights from experimental datasets and theoretical predictions. I developed a novel approach using Bézier curves to represent parton distribution functions (PDFs), the probability densities of the partons (constituent particles) within the pion. This approach allows us to systematically explore a source of model uncertainty that had not previously been accounted for in high-energy studies of the pion.
Today, I’m looking to apply the same analytical thinking, statistical analysis, coding expertise, and problem-solving mindset to data science, machine learning, and applied research challenges.
Research Interests
Lucas Kotz — Résumé
Full PDF available for download · Last updated 2025
Professional Summary
Data scientist and simulation specialist with a PhD in Physics and active experience supporting large-scale U.S. government experimentation programs. Combines a rigorous academic foundation in statistical modeling, Bayesian inference, and computational analysis with applied expertise in software evaluation, system integration, and technical advising in defense environments. Seeking to apply cross-domain analytical and modeling skills to data science and R&D roles in both defense and private industry.
- Rapidly acquired and applied expertise in complex theoretical frameworks and scientific computing tools, demonstrating adaptability across multiple programming languages and modeling domains.
- Communicated complex research findings to diverse audiences — from specialists to non-technical stakeholders — at international conferences and workshops.
- Currently applying advanced analytical and modeling expertise to defense technology evaluation and large-scale simulation environments.
Work Experience
- Performed contract work supporting a large-scale government experimentation program (300+ participants, 2 weeks), contributing to its transition from a live tabletop format to a fully digital simulated environment.
- Evaluated and vetted simulation engines and analytical software through structured vendor demonstrations, comparative internal assessments, and written technical reports.
- Configured and integrated curated software tools onto a centralized platform hosted on a U.S. government network, performing last-mile checks to ensure tools met operational requirements.
- Coordinated with stakeholders across multiple divisions to track deliverables and seamlessly integrate a curated tech environment into the operational program.
- Contributed to the UI/UX development of a centralized platform homepage to improve user navigation and tool accessibility.
- Designed and facilitated training sessions for end users on integrated software tools, improving platform adoption and user readiness.
- Designed and deployed C++ tools for advanced statistical modeling, including parameter estimation, Bézier curve parameterization, and uncertainty quantification.
- Validated and improved models using Bayesian inference, Hessian analysis, Monte Carlo methods, sensitivity analyses, data visualizations, and applied gradient descent optimization and chi-squared minimization techniques.
- Presented technical research to audiences ranging from non-technical to expert, demonstrating strong communication skills and adaptability.
- Independently managed projects while proactively collaborating with peers internationally and domestically, initiating communication to ensure project alignment and progress.
- Kept up to date with emerging advances in high-energy physics by regularly reviewing recent publications, preprints, and conference proceedings to inform ongoing research.
- Guided collaborative problem-solving and critical thinking exercises for lab groups within classes of up to 27 students.
- Instructed laboratory techniques and ensured safety for up to 27 students per lab.
Projects
- Improved modeling of pion structure by engineering a C++ module for an open-source theoretical model fitting framework and implementing Bézier curve parameterizations.
- Enhanced uncertainty coverage by 50–300% compared to traditional methods by utilizing Bayesian inference to fit Bézier curves to experimental datasets.
- Validated C++ results against Wolfram Mathematica.
- Produced a published paper and two forthcoming papers.
- Quantified and visualized dataset sensitivity for parton distribution function validation.
- Reduced dataset impact analysis time from weeks/months to hours by pioneering a novel approach.
- Produced a sole-author paper.
- Email spam classification: Trained a logistic regression model to classify emails as spam or not using publicly available datasets. Evaluated model performance with a confusion matrix. Identified top spam indicators by analyzing the model's coefficients.
- Signal analysis: Processed time-series data from simulated current and magnetic field sensor readings. Parameterized the current signal using curve fitting to model its behavior over time. Estimated signal gain by analyzing relationships between magnetic field measurements and modeled current.
- DFW house price prediction: Developed and evaluated predictive models — including decision trees, linear regression, and neural networks — using publicly available data to estimate average home prices. Assessed model performance and conducted comparative analysis using current housing listings.
Technical Skills
Programming Languages
C++ · Python · Fortran · HTML · Bash · Make · LaTeX
Python Libraries
pandas · numpy · scikit-learn · TensorFlow · matplotlib
Statistical Modeling & Analysis
Bayesian Inference · Monte Carlo · χ² Minimization · Uncertainty Quantification · Machine Learning · Sensitivity Analysis · Data Visualization
Tools & Platforms
Mathematica · Jupyter · SQL · CERN ROOT · MATLAB · VS Code
Systems & Workflow
Linux · Git · HPC · SLURM · gdb / ddd
Selected Publications
- L. Kotz, A. Courtoy, P. Nadolsky, F. Olness and M. Ponce-Chavez, Analysis of parton distributions in a pion with Bézier parameterizations, Phys. Rev. D 109 (2024) 074027, [arXiv:2311.08447].
- L. Kotz, A study of experimental sensitivities to proton parton distributions with xFitter, [arXiv:2401.11350].
- L. Kotz, A. Courtoy, P. Nadolsky, and M. Ponce-Chavez, Fantômas: epistemic and nuclear uncertainties for the parton distributions of the pion, [arXiv:2505.13594].
- L. Kotz, A. Courtoy, T. J. Hobbs, P. Nadolsky, F. Olness, M. Ponce-Chavez, and V. Purohit, Fantômas Unconfined: global QCD fits with Bézier parameterizations, [arXiv:2507.22969].
Education
Research
Projects & Work
Research Overview
During my PhD at SMU, I focused on analyzing the internal structure of subatomic particles (like the pion) using parton distribution function models. Parton distribution functions (PDFs) are data-driven models derived from high-energy collider experiments, including the currently operating Large Hadron Collider (LHC) and the soon-to-be-built Electron-Ion Collider (EIC). These models describe how a particle’s momentum is distributed among its constituent quarks and gluons — a cornerstone of modern particle physics research.
I developed new modeling approaches for pion PDFs and analyzed how different experimental datasets affected the model’s results. The pion is crucial in holding atomic nuclei together (via the strong force), which is why it’s an important particle to study.
Although my work centers on structured, interpretable alternatives to neural networks, it draws on many of the same statistical foundations, including gradient descent optimization, chi-squared loss minimization, and model reliability techniques. I employed methods such as model verification, bootstrap-based uncertainty estimation, and comparative analysis across multiple parameterizations to assess model stability and robustness. This overlap gives me strong cross-compatibility with modern machine learning workflows and tools, and also enabled me to train and evaluate several baseline models as part of my broader data science development.
You can find all of my published work on Inspire HEP.
Fantômas4QCD →
The Fantômas4QCD project resulted in the development of a custom PDF modeling module using Bézier curves as the core technique (essentially using them as universal function approximators). This approach combines the transparency of a simple polynomial model with the flexibility of a neural network, allowing us to explore a much wider range of viable solutions in QCD models.
Lattice QCD · C++ / Mathematica
L2 Sensitivity →
To accurately interpret theoretical predictions, it's important to understand how individual datasets influence the fitted model. L2 sensitivity quantifies this by calculating how much the chi-squared value changes when PDFs are varied by one standard deviation.
This method highlights which datasets exert the most pull on the fit — helping identify potential outliers, inconsistencies, or overreliance on specific data.
Statistics · HEP Fits
Data Visualization →
Scientific results are only valuable if they can be effectively communicated. Throughout my research, I've emphasized clear data visualization to convey key findings in presentations, papers, and collaborations.
Using tools like Mathematica, ManeParse, and CERN's ROOT, I've created visualizations that reveal model behavior, compare datasets, and highlight statistical significance in an accessible and digestible way.
D3.js · Python
Fantômas4QCD
Our team set out to create a new, flexible way to model parton distribution functions (PDFs) – the statistical profiles that show how a particle’s momentum is shared among its quarks and gluons.
We applied Bézier curves in our PDF modeling, giving our solution the clarity of a simple mathematical formula and the adaptability of a neural network – a unique combination not seen in previous models.
The Fantômas module we built quantifies uncertainties in the pion’s structure that earlier models could not measure, providing new insights for QCD analysis.
The xFitter program with the Fantômas module implemented is found here.
Bézier curves: From cars to physics
To better describe the internal structure of the pion (and other hadrons), we use Bézier curves, originally developed for car design.
Invented by Paul de Casteljau (Citroën, 1958) and Pierre Bézier (Renault, 1960), Bézier curves were first used to model smooth car body shapes using only a few control points. The Citroën DS (left) was one of the first cars designed using this technology, which later became foundational in computer-aided design (CAD).
Today, more than 60 years later, we apply the same mathematical framework in high-energy physics. Bézier curves allow us to construct smooth, continuous, and highly adaptable functions for modeling PDFs. Their interpretability and flexibility make them ideal for capturing model-dependent uncertainty in a physically meaningful way.
How we extract PDFs using Bézier curves
Since parton distribution functions (PDFs) are not directly observable, we must infer them from experimental data — specifically, measurements from fixed-target and collider experiments.
Our module was implemented into xFitter, an open-source framework for global QCD analyses. xFitter fits theoretical models to data by using a gradient-based minimization algorithm (Minuit) to minimize the chi-squared (χ²) goodness-of-fit value, identifying the best-fit Bézier parameterization for the given data.
By adjusting the number and location of control points, the module can scan a wide range of model possibilities, giving us a more complete picture of the theoretical uncertainties in PDF extraction.
Extracted PDFs from pion data
From the hundreds of solutions generated by the Fantômas module, we selected five high-quality fits that together captured the full range of observed features.
These were combined into a single model that incorporates both aleatoric (statistical) and epistemic (model-based) uncertainties — providing a more comprehensive understanding of the pion's structure.
The complete pion PDF set is located here.
Visualizing pion data
After identifying the best-fit models, we analyzed their statistical behavior and interrelationships.
L2 Sensitivity
Understanding how each dataset influences a model is critical, especially when combining multiple datasets, where conflicting signals can obscure important patterns. That’s why it’s essential to evaluate dataset impact even after a model has been built.
Recently, a new technique called L2 sensitivity was developed to measure how much each dataset influences a model’s fit (link to study). In essence, it checks how much the model’s error (χ² goodness-of-fit) changes when a PDF parameter is adjusted by one standard deviation.
However, the standard L2 sensitivity method only works if the dataset is already part of the model’s analysis, which means it can’t directly evaluate brand-new or external data.
To solve that problem, I developed a more flexible procedure that applies L2 sensitivity to any dataset, even if it’s not part of the original model. I integrated this solution into the open-source xFitter tool and validated it with real data (as documented in my publication at arXiv).
Files related to the L2 sensitivity extraction method can be found here.
Baseline Comparison
I verified my method by comparing its results to an earlier study on how datasets affect proton PDFs. My results closely matched the prior study’s findings, confirming that my approach works reliably.
There were a few minor differences because I defined the chi-squared metric slightly differently and handled heavy-quark data in another way. These technical details only caused small changes in the L2 sensitivity results (as a function of the parton’s momentum fraction x).
Exploring New Datasets
After proving the method, I applied the L2 sensitivity method to new datasets whose influence hadn’t been examined before. Traditionally, to evaluate a new dataset’s influence, you’d have to spend weeks integrating it into the model’s codebase.
In contrast, my approach can assess a dataset’s impact in just a few hours and requires no changes to the core code. This makes it possible to quickly screen new data for its relevance (or redundancy), greatly speeding up the analysis pipeline.
Data Visualization
During my PhD, I relied heavily on data visualization to turn complex results into clear, accessible insights. I often analyzed parton distribution functions (PDFs) to predict measurable outcomes, and I had to create clear, informative charts both for technical reports and for communicating with non-specialist stakeholders.
Below are selected examples showcasing how I visualized data from my studies. Most of these figures were produced using Mathematica with the ManeParse package or CERN’s ROOT framework, widely used in particle physics. You can find more information about ManeParse here and ROOT here.
Gallery
Dissertation
Doctoral Thesis
A Novel Approach to Model the Pion Structure Using Advanced Polynomial Functions
In the coming years, new experiments at the Electron-Ion Collider (EIC) and the Large Hadron Collider (LHC) will provide unprecedented insight into the building blocks of matter. To make the most of these opportunities, scientists must reduce uncertainties in the theoretical models that connect what we observe in experiments to what is happening inside particles like protons and pions. A critical part of these models involves describing how a particle's momentum is shared among its internal components – quarks and gluons.
This work focuses on improving how we model the internal structure of the pion. We propose a new approach using smooth, flexible mathematical curves – called Bézier curves – to describe this structure without overly rigid assumptions. Integrating this method into commonly used analytical tools allows us to study how different modeling choices affect our results. As future experiments deliver more data, our approach will help uncover a broader understanding of the pion's inner structure and the forces that hold nuclei together.
Full Thesis
The complete dissertation is archived in the SMU Institutional Repository. Includes the full text, appendices, and supplementary materials as submitted for the doctoral degree.
Hosted by: SMU Scholar · Southern Methodist University
Defense Slides
Presentation slides from the doctoral defense. Covers the core motivation, methodology, key results, and outlook of the dissertation research in a concise format.
Format: PDF · Defense presentation
Defense info
Defended: April 2025
Institution: Southern Methodist University
Advisor: Prof. Pavel Nadolsky, Michigan State University
Advisor: Prof. Fred Olness, Southern Methodist University
External: Prof. Aurore Courtoy, Universidad Nacional Autónoma de México
Member: Prof. Allison Deiana, Southern Methodist University
Member (Chair): Prof. Matthew Klein, Southern Methodist University
Selected Presentations
Fall 2024 Joint Meeting of the Texas American Physics Society, October 18, 2024: Bézier curve parameterization for pion PDF
CTEQ Spring meeting 2024, June 4, 2024: Recent work on L2 sensitivities and Bézier PDF parametrizations in xFitter
xFitter External Meeting 2023, May 3, 2023: Bézier curves and pion PDFs with xFitter
DIS2023, March 29, 2023: Bézier curve parametrization for pion PDFs