UUIDs and Lineage Tracking

ASSYST workflows generate, relax, and perturb structures in successive steps. Keeping track of which structure came from which parent is essential for two reasons:

  1. Reproducibility — when a structure causes a problem (e.g. an exploding DFT run) we want to find the seed it grew from.

  2. Joining data — energies and forces are typically computed in a separate workflow. A stable identifier lets us merge those results back onto the original structures.

ASSYST does this with three keys stored in atoms.info:

key       meaning
-------   -------------------------------------------------------
uuid      unique id of this structure
seed      uuid of the original ancestor (set once, never changes)
lineage   list of all previous uuids, oldest first

This notebook walks through how each step in the workflow updates these.
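
The update rule behind these three keys can be sketched in plain Python. The helpers fresh_info and derive_info below are hypothetical names that operate on ordinary dicts standing in for atoms.info. They illustrate the bookkeeping described above and are not ASSYST's actual implementation.

```python
import uuid


def fresh_info():
    """Info for a newly generated structure: uuid == seed, empty lineage."""
    uid = str(uuid.uuid4())
    return {"uuid": uid, "seed": uid, "lineage": []}


def derive_info(parent):
    """Info for a structure derived from `parent` (hypothetical helper)."""
    return {
        "uuid": str(uuid.uuid4()),  # every derived structure gets a new id
        "seed": parent["seed"],     # the seed is copied, never regenerated
        # the parent's uuid is appended, so the list stays oldest first
        "lineage": [*parent.get("lineage", []), parent["uuid"]],
    }


generated = fresh_info()
relaxed_info = derive_info(generated)
rattled_info = derive_info(relaxed_info)

print(rattled_info["seed"] == generated["uuid"])
print(rattled_info["lineage"] == [generated["uuid"], relaxed_info["uuid"]])
```

Returning a new dict instead of mutating the parent's keeps the parent record intact, which matches the notebook's pattern of working on copies.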

Imports

from assyst.crystals import pyxtal
from assyst.perturbations import Rattle, Stretch
from assyst.relaxations import Relax
from assyst.calculators import Morse

1. Generation

A freshly generated structure has no ancestors. Its uuid and seed are identical and the lineage is empty.

atoms = pyxtal(225, species=['Cu'], num_ions=[4])

print(f"uuid:    {atoms.info['uuid']}")
print(f"seed:    {atoms.info['seed']}")
print(f"lineage: {atoms.info.get('lineage', [])}")
uuid:    f6d3b4dd-af88-4bbb-81f0-bb7a55089926
seed:    f6d3b4dd-af88-4bbb-81f0-bb7a55089926
lineage: []

2. Relaxation

Relaxing a structure produces a new Atoms object. The new object gets its own uuid, the parent's uuid is appended to lineage, and the seed is carried over unchanged.

relax = Relax(max_steps=5)
atoms.calc = Morse().get_calculator()
relaxed = relax.relax(atoms)

print(f"uuid:    {relaxed.info['uuid']}")
print(f"seed:    {relaxed.info['seed']}")
print(f"lineage: {relaxed.info['lineage']}")
uuid:    ecad2002-c6c3-4ba3-81b7-278f5504a31d
seed:    f6d3b4dd-af88-4bbb-81f0-bb7a55089926
lineage: ['f6d3b4dd-af88-4bbb-81f0-bb7a55089926']

3. Perturbation

Perturbations work the same way: the lineage grows, the seed stays put.

rattled = Rattle(sigma=0.05)(relaxed.copy())

print(f"uuid:    {rattled.info['uuid']}")
print(f"seed:    {rattled.info['seed']}")
print(f"lineage: {rattled.info['lineage']}")
uuid:    a8eb4310-65a5-4311-9de7-8763626133b8
seed:    f6d3b4dd-af88-4bbb-81f0-bb7a55089926
lineage: ['f6d3b4dd-af88-4bbb-81f0-bb7a55089926', 'ecad2002-c6c3-4ba3-81b7-278f5504a31d']
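
Because every descendant carries the same seed, structures can be grouped by it, for example to collect everything that grew from a problematic starting point. The following is a minimal sketch with plain dicts standing in for atoms.info; the short ids are placeholders, not real UUIDs.

```python
from collections import defaultdict

# Placeholder records standing in for atoms.info of several structures.
infos = [
    {"uuid": "a0", "seed": "a0", "lineage": []},
    {"uuid": "a1", "seed": "a0", "lineage": ["a0"]},
    {"uuid": "a2", "seed": "a0", "lineage": ["a0", "a1"]},
    {"uuid": "b0", "seed": "b0", "lineage": []},
    {"uuid": "b1", "seed": "b0", "lineage": ["b0"]},
]

# Map each seed to the uuids of all structures descended from it
# (the seed structure itself included).
by_seed = defaultdict(list)
for info in infos:
    by_seed[info["seed"]].append(info["uuid"])

for seed, members in by_seed.items():
    print(f"{seed}: {members}")
```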

4. Chained modifications

Stacking another perturbation on top extends the lineage further. The ancestry of any structure is fully recoverable from the lineage list.

final = Stretch(hydro=0.05, shear=0.05)(rattled.copy())

print(f"uuid:    {final.info['uuid']}")
print(f"seed:    {final.info['seed']}")
print(f"lineage: {final.info['lineage']}")
uuid:    edcc557b-efea-45bd-bde2-8c37a2de4562
seed:    f6d3b4dd-af88-4bbb-81f0-bb7a55089926
lineage: ['f6d3b4dd-af88-4bbb-81f0-bb7a55089926', 'ecad2002-c6c3-4ba3-81b7-278f5504a31d', 'a8eb4310-65a5-4311-9de7-8763626133b8']
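
The outputs so far suggest a few invariants: the first lineage entry equals the seed, a structure's uuid never appears in its own lineage, and no id repeats. The hypothetical check_lineage below tests them on a plain dict. It is a sanity check inferred from the examples in this notebook, not a documented ASSYST function.

```python
def check_lineage(info):
    """Return a list of problems found in an info dict (empty if consistent)."""
    problems = []
    lineage = info.get("lineage", [])
    if lineage:
        # A derived structure: the oldest ancestor must be the seed.
        if lineage[0] != info["seed"]:
            problems.append("first lineage entry is not the seed")
    elif info["uuid"] != info["seed"]:
        # A structure without ancestors must be its own seed.
        problems.append("empty lineage but uuid differs from seed")
    if info["uuid"] in lineage:
        problems.append("uuid appears in its own lineage")
    if len(set(lineage)) != len(lineage):
        problems.append("duplicate ids in lineage")
    return problems


good = {"uuid": "c", "seed": "a", "lineage": ["a", "b"]}
bad = {"uuid": "b", "seed": "a", "lineage": ["x", "b"]}

print(check_lineage(good))
print(check_lineage(bad))
```

Running such a check after reading structures back from disk is a cheap way to notice info dicts that were copied or edited by hand.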

Summary

Tracing back from final, we can reconstruct the full history:

# lineage[0] is the seed, so lineage plus the current uuid is the full chain, oldest first
chain = [*final.info['lineage'], final.info['uuid']]
for i, u in enumerate(chain):
    role = 'seed' if i == 0 else ('current' if i == len(chain) - 1 else f'step {i}')
    print(f"{role:>8}: {u}")
    seed: f6d3b4dd-af88-4bbb-81f0-bb7a55089926
  step 1: ecad2002-c6c3-4ba3-81b7-278f5504a31d
  step 2: a8eb4310-65a5-4311-9de7-8763626133b8
 current: edcc557b-efea-45bd-bde2-8c37a2de4562
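
For the second use case from the introduction, merging results computed in a separate workflow, the uuid serves as the join key. The following is a minimal sketch, assuming the external workflow returns a mapping from uuid to energy; the record layout, the short ids, and the numbers are made up for illustration.

```python
# Placeholder structures (only the info dict matters for the join).
structures = [
    {"uuid": "a0", "seed": "a0", "lineage": []},
    {"uuid": "a1", "seed": "a0", "lineage": ["a0"]},
    {"uuid": "a2", "seed": "a0", "lineage": ["a0", "a1"]},
]

# Results from a separate workflow, keyed by uuid (illustrative values).
energies = {"a1": -3.52, "a2": -3.47}

# Attach the energy to each record; None marks structures without a result.
merged = [
    {**info, "energy": energies.get(info["uuid"])}
    for info in structures
]

missing = [m["uuid"] for m in merged if m["energy"] is None]
print(f"{len(merged) - len(missing)} matched, missing: {missing}")
```

Listing the unmatched uuids explicitly makes it easy to spot calculations that failed or were never submitted.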