Research
This page contains some summaries of the projects I’ve worked on. Click the link in the list below to be taken to the project summary.
Contents
ruckus
Carbon Chaos!
Efficient Quantum Computing
stoclust
ruckus
Kernel embedding networks with Python and scikit-learn
ruckus is my latest project: a Python package for working with networks of reproducing kernel Hilbert spaces, for use in machine learning, time-series analysis, and dynamical systems modeling. The package is inspired by, and intended to further, my own work on the foundations of RKHS methods for time-series prediction (Loomis & Crutchfield 2021).
Reproducing kernel Hilbert spaces (RKHS’s, or, as I say it, “ruckuses”) form the mathematical bedrock of numerous machine learning techniques, from support vector machines and Gaussian processes to neural tangent kernels and random feature embeddings.
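To make this concrete, here is a minimal sketch of the basic maneuver these methods share, written with stock scikit-learn rather than ruckus itself: embed data in an (approximate) RKHS feature space and fit a linear model there, which is a nonlinear model in the original coordinates. The kernel and parameters below are illustrative choices only.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Noisy nonlinear time series: predict x[t+1] from x[t].
rng = np.random.default_rng(0)
x = np.linspace(0, 6 * np.pi, 500)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
X_t, X_next = y[:-1].reshape(-1, 1), y[1:]

# Nystroem approximates the RBF-kernel feature map, embedding each point
# in a finite-dimensional stand-in for the RKHS; ridge regression in that
# space is then a nonlinear predictor in the original space.
model = make_pipeline(Nystroem(kernel="rbf", gamma=1.0, n_components=100),
                      Ridge(alpha=1e-3))
model.fit(X_t, X_next)
print(model.score(X_t, X_next))
```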
ruckus provides specialized objects for defining, fitting, and applying RKHS's to data, as well as the tools needed to build intricately designed deep and convolutional RKHS networks. It also supports kernel distributional embeddings and efficient sampling of forecasts via kernel herding; a toy version of herding is sketched below.
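The sketch below is a generic plain-NumPy version of kernel herding, not ruckus's own interface: given samples from a target distribution, herding greedily selects a compact set of "super-samples" whose kernel mean embedding tracks that of the target. The kernel, bandwidth, and candidate grid are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(a, b, gamma=2.0):
    # RBF kernel matrix between two sets of 1-D points.
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

# Target distribution, known only through samples z; its empirical kernel
# mean embedding is mu(x) = mean_n k(x, z_n).
z = rng.normal(0.0, 1.0, size=1000)
grid = np.linspace(-4, 4, 401)  # candidate points to herd from
mu = rbf(grid, z).mean(axis=1)

herd = []
for t in range(10):
    # Greedy herding step: favor regions of high embedding weight that are
    # not yet covered by previously chosen super-samples.
    penalty = rbf(grid, np.array(herd)).sum(axis=1) / (t + 1) if herd else 0.0
    herd.append(grid[np.argmax(mu - penalty)])

print(np.round(herd, 2))  # compact sample set representing the target
```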
This project is a work in progress. Stay tuned for more updates!
Packages
ruckus
Carbon Chaos!
Nonequilibrium Methods in Emissions Analysis
The “carbon footprint” has become a fairly ubiquitous character in discussions of climate change and economic policy. The methods for computing footprints originate in the field of input-output analysis, which traces through networks of economic dependency to link consumption activities with the production activities, and resulting emissions, that sustain them. This allows one not only to calculate how much carbon is necessary to support someone’s lifestyle, but also to determine the geographic distribution of emissions, linking the consumption practices of one region with pollution in another.
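For readers unfamiliar with the machinery, the textbook version of this calculation is short. The sketch below uses made-up numbers and the standard Leontief-inverse construction of environmentally extended input-output analysis; it is a toy illustration, not the data pipeline used in our analysis.

```python
import numpy as np

# Toy three-sector economy. A[i, j] = dollars of sector-i input required
# per dollar of sector-j output (the technical coefficients matrix).
A = np.array([[0.10, 0.20, 0.05],
              [0.15, 0.05, 0.10],
              [0.05, 0.10, 0.20]])

f = np.array([0.9, 0.3, 0.1])      # direct CO2 per dollar of output ("intensity")
y = np.array([50.0, 80.0, 120.0])  # final consumption demand, in dollars

# Leontief inverse: total (direct + indirect) output needed per dollar of
# final demand, tracing the full network of economic dependencies.
L = np.linalg.inv(np.eye(3) - A)

x = L @ y          # total output required to satisfy demand y
footprint = f @ x  # total emissions embodied in that consumption
print(footprint)
```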
Input-output analysis rests on a number of assumptions. We analyzed the consequences of these assumptions in (Loomis, Cooper & Crutchfield 2021), with data from the Global Trade Analysis Project. The upshot is that whether a nation’s footprint is located within domestic borders or beyond is primarily determined by that nation’s “carbon intensity”: the amount of \(\mathrm{CO}_2\) emitted per dollar of income. This arises as a theoretical consequence of input-output assumptions, with majorization, a tool from nonequilibrium thermodynamics, playing a central role. These results give a better understanding of how input-output models derive their conclusions, but consequently cast doubt on their ability to properly test hypotheses of unequal exchange of embodied pollution between regions.
Because this effect is purely mathematical in nature, it can be observed for any resource one tracks through input-output analysis. The following graph shows, for each nation, the relationship between the national average wage (the analogous “intensity” for labor) and the ratio of produced to consumed embodied labor. The correlation is highly suggestive, but it is chiefly a mathematical effect: input-output analysis predicts that high-wage nations import more embodied labor, but it does not provide evidence that they do, an important distinction elucidated by our thermodynamic analysis.
One of the more questionable assumptions of input-output analysis, which is crucial to the results above, is its inherent linearity. Further work in this area will seek to offer alternatives to input-output analysis, performing nonlinear dynamical analysis of policy impacts to determine the more complex relationships between consumption, production and emissions.
Articles
Loomis et al. Nonequilibrium thermodynamics in measuring carbon footprints. Submitted.
Packages
github.com/samlikesphysics/netacam_code.git
Efficient Quantum Computing
Quantum memory compression and energy savings
Computers may be thought of as engines for transforming free energy into waste heat and mathematical work.
Einstein proved that matter and energy could be interchanged; some time later, Rolf Landauer proved that energy and information could be interchanged as well. More specifically, in his resolution of the paradox of Maxwell’s Demon, Landauer demonstrated that the erasure of a bit of information requires expending a minimum amount of energy, \(kT\ln 2\), where \(k\) is the Boltzmann constant and \(T\) is the ambient temperature of the computer (R. Landauer, “Irreversibility and heat generation in the computing process,” 1961). Bit erasure is a key tool in computational tasks, and so Landauer’s principle placed fundamental energy costs on computation itself.
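For a sense of scale, the bound is easy to evaluate numerically; a quick aside using SciPy's physical constants:

```python
import numpy as np
from scipy.constants import Boltzmann as k

# Landauer's bound: minimum energy dissipated by erasing one bit at
# ambient temperature T.
T = 300.0  # room temperature, in kelvin
print(f"{k * T * np.log(2):.3e} J per bit erased")  # ~2.87e-21 J
```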
We’ve contributed to this line of work by studying a particular computational task: simulating a specified pattern, written onto an empty tape. Though this may sound simple, the pattern may be arbitrarily complicated: anything from random data to Shakespeare. This makes simulation an extremely deep arena for studying the costs of computation.
The most efficient machine for generating a pattern using only classical physics is called an \(\epsilon\)-machine. We examined a recent quantum extension of the \(\epsilon\)-machine, called the \(q\)-machine, and put it to the test: under what conditions do \(q\)-machines provide crucial advantages in memory and energy costs?
We provided the first proofs that \(q\)-machines are always able to improve on \(\epsilon\)-machines in memory costs (Loomis & Crutchfield 2019). However, this work did not initially address the thermodynamic angle, so it remained unclear whether this improvement in memory came at a cost. The answer required distinguishing between two classes of \(\epsilon\)-machine, called the forward and reverse varieties. We discovered, in the two companion papers “Thermal efficiency of quantum memory compression” and “Thermodynamically efficient local computation,” that while forward \(\epsilon\)-machines are less efficient in both memory and energy costs, reverse \(\epsilon\)-machines exhibit a tradeoff: they leverage their additional memory costs to save on energy costs. Conversely, \(q\)-machines exhibit higher energy costs in order to save on memory.
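The mechanism behind the memory advantage shows up even in a toy calculation. The sketch below is not the full \(q\)-machine construction; it only illustrates how encoding two memory states as non-orthogonal quantum states drives the von Neumann entropy (the quantum memory cost) below the Shannon entropy (the classical cost). The probabilities and overlap value are illustrative assumptions.

```python
import numpy as np

def quantum_memory(p0, p1, c):
    # Two memory states encoded as pure states with overlap c = <eta0|eta1>,
    # represented in a 2-D real subspace: eta0 = (1, 0), eta1 = (c, sqrt(1-c^2)).
    eta0 = np.array([1.0, 0.0])
    eta1 = np.array([c, np.sqrt(1 - c**2)])
    rho = p0 * np.outer(eta0, eta0) + p1 * np.outer(eta1, eta1)
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log2(evals))  # von Neumann entropy, in bits

# Orthogonal states reproduce the classical (Shannon) cost of 1 bit;
# overlapping states cost strictly less memory.
print(quantum_memory(0.5, 0.5, 0.0))  # 1.0 bit
print(quantum_memory(0.5, 0.5, 0.6))  # ~0.72 bits
```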
Takeaway: Quantum can save space and time—but when it comes to generating patterns, classical is still the most energy-efficient way to go.
Publications
Loomis & Crutchfield. Thermodynamically efficient local computation. 2020
Loomis & Crutchfield. Thermal efficiency of quantum memory compression. 2020
Loomis et al. Optimizing quantum models of classical channels. 2020
Loomis & Crutchfield. Strong and weak optimizations in classical and quantum models of stochastic processes. 2019
Aghamohammadi et al. Extreme quantum memory advantage for rare-event sampling. 2018
stoclust
Stochastic clustering in Python
stoclust is a package of modularized methods for stochastic and ensemble clustering techniques. By modular, I mean that few methods in this package act as a single pipeline for clustering a dataset; rather, each method forms a unit of what might be a larger clustering routine. These modular units are designed to be compatible with general clustering methods from other packages, like scipy.cluster or sklearn.cluster. However, we also provide specific methods for implementing clustering algorithms whose underlying mathematics is rooted in stochastic analysis and dynamics. Additionally, one can add a stochastic twist to any clustering method by using ensemble clustering, which uses randomness to probe the stability and robustness of clustering results.
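As an illustration of the ensemble idea, here is a self-contained sketch using scikit-learn rather than stoclust's own API (which differs): rerun a base clusterer under different random initializations and accumulate a co-occurrence matrix measuring how robustly each pair of points clusters together.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data with three loose clusters.
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=1.5, random_state=0)

n_runs = 50
n = X.shape[0]
co_occurrence = np.zeros((n, n))

# Ensemble clustering: re-run k-means with random initializations and
# count how often each pair of points lands in the same cluster.
for seed in range(n_runs):
    labels = KMeans(n_clusters=3, n_init=1, random_state=seed).fit_predict(X)
    co_occurrence += labels[:, None] == labels[None, :]

co_occurrence /= n_runs
# Pairs with co-occurrence near 1 are robustly clustered together;
# values near 0.5 flag unstable assignments.
print(co_occurrence[:5, :5].round(2))
```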
Packages
stoclust
Rome wasn’t built in a day. And that would be crazy, because this website is far less complicated and also wasn’t built in a day. I’ll add more project details soon.