biochar_generator

Convenience functions

biochar.biochar_generator.generate_biochar(target_num_carbons=50, H_C_ratio=None, O_C_ratio=None, aromaticity_percent=None, functional_groups=None, defect_fraction=0.0, max_ether_span=5, num_pyridinic=0, num_pyrrolic=0, num_graphitic=0, charge_method='opls', temperature=None, feedstock=None, output_directory='.', basename='biochar', molecule_name='BC', seed=None)[source]

Convenience function to generate and export biochar in one call.

Parameters:
  • target_num_carbons – Target number of carbon atoms.

  • H_C_ratio – Target hydrogen-to-carbon ratio.

  • O_C_ratio – Target oxygen-to-carbon ratio. Used to determine total oxygen when functional_groups is None.

  • aromaticity_percent – Target aromaticity percentage.

  • functional_groups – Dict mapping functional group name → exact count, e.g. {"phenolic": 3, "carboxyl": 1}.

    Supported groups:

    • phenolic — aromatic C–OH (1 O per group)

    • hydroxyl — same as phenolic for pure aromatic PAH (1 O)

    • carboxyl — aromatic C–C(=O)(OH) (2 O per group)

    • ether — aromatic C–O–C bridge (1 O per group)

    • carbonyl — not supported; substituted with phenolic

    • quinone — not supported; substituted with phenolic

    • lactone — not supported; substituted with phenolic

    If None (default), the total oxygen count is derived from O_C_ratio and placed as phenolic (–OH) groups.

  • defect_fraction – Probability [0, 1) that each ring added during skeleton growth is a 5-membered (pentagon) ring rather than a hexagon. 0.0 (default) = pure hexagonal PAH. Values ~0.1–0.2 introduce realistic topological disorder.

  • max_ether_span – Maximum number of C–C bonds between the two ring carbons bridged by each ether oxygen. Controls the ring size of the C–O–C bridge (ring size = max_ether_span + 2). A value of 3 gives a 5-membered furan/benzofuran-like ring that always stays flat; this is the GeneratorConfig default, whereas the signature above defaults to 5. Use 4 for a pyran/chromene-like (6-membered) ring or 5 for a 7-membered ring; larger values risk cross-sheet bridges that fold the molecule.

  • output_directory – Output directory for GROMACS files.

  • basename – Base filename for output files.

  • molecule_name – Residue name (max 5 chars). Suggested: BC400, BC600, BCH05, BCO10.

  • seed – Random seed for reproducibility.

Return type:

Tuple[Mol, ndarray, Path, Path, Path]

Returns:

(molecule, coordinates, gro_path, top_path, itp_path)

Parameters:
  • target_num_carbons (int)

  • H_C_ratio (float | None)

  • O_C_ratio (float | None)

  • aromaticity_percent (float | None)

  • functional_groups (Dict[str, int] | None)

  • defect_fraction (float)

  • max_ether_span (int)

  • num_pyridinic (int)

  • num_pyrrolic (int)

  • num_graphitic (int)

  • charge_method (str)

  • temperature (float | None)

  • feedstock (str | None)

  • output_directory (str)

  • basename (str)

  • molecule_name (str)

  • seed (int | None)
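The oxygen bookkeeping described for functional_groups can be illustrated with a small standalone sketch. It does not call the biochar package: resolve_groups and total_oxygen are hypothetical helper names, the rounding rule for O_C_ratio and the error raised for unknown group names are assumptions, and only the per-group oxygen counts and the phenolic substitution come from the parameter description above.

```python
from typing import Dict, Optional

# Oxygen atoms contributed by each supported group, as listed in the docs.
OXYGENS_PER_GROUP = {"phenolic": 1, "hydroxyl": 1, "carboxyl": 2, "ether": 1}
# Groups the docs describe as "not supported; substituted with phenolic".
SUBSTITUTED_WITH_PHENOLIC = {"carbonyl", "quinone", "lactone"}


def resolve_groups(functional_groups: Optional[Dict[str, int]],
                   target_num_carbons: int,
                   O_C_ratio: Optional[float]) -> Dict[str, int]:
    """Return the group counts that would be placed (hypothetical helper)."""
    if functional_groups is None:
        # Assumption: total oxygen = round(O_C_ratio * carbons); the actual
        # rounding rule used by the library is not documented.
        total_o = round((O_C_ratio or 0.0) * target_num_carbons)
        return {"phenolic": total_o} if total_o else {}
    resolved: Dict[str, int] = {}
    for name, count in functional_groups.items():
        if name in SUBSTITUTED_WITH_PHENOLIC:
            name = "phenolic"
        elif name not in OXYGENS_PER_GROUP:
            # Assumption: unknown names are rejected in this sketch.
            raise ValueError(f"unknown functional group: {name!r}")
        resolved[name] = resolved.get(name, 0) + count
    return resolved


def total_oxygen(groups: Dict[str, int]) -> int:
    """Sum the oxygen atoms implied by a resolved group dict."""
    return sum(OXYGENS_PER_GROUP[name] * count for name, count in groups.items())


groups = resolve_groups({"phenolic": 3, "carboxyl": 1, "quinone": 2}, 50, None)
print(groups, total_oxygen(groups))
```

Here the two quinone requests are folded into phenolic, so the resolved dict holds five phenolic groups and one carboxyl group, seven oxygen atoms in total.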

biochar.biochar_generator.generate_surface(target_num_carbons=50, H_C_ratio=0.3, O_C_ratio=0.05, functional_groups=None, defect_fraction=0.0, pore_diameter=10.0, num_sheets=2, pore_type='slit', max_attempts=500, min_separation=3.0, sheet_overrides=None, output_directory='.', basename='surface', system_name='SLIT', seed=None, strict=True)[source]

Generate a slit-pore surface system and export to GROMACS files.

Creates num_sheets parallel graphene-like sheets separated by pore_diameter Ångströms, applies functional groups to each sheet, and writes GROMACS-ready .gro / .top / .itp files.

Parameters:
  • target_num_carbons – Number of carbon atoms per sheet.

  • H_C_ratio – Target H/C ratio for each sheet.

  • O_C_ratio – Target O/C ratio for each sheet (used when functional_groups is None).

  • functional_groups – Functional groups applied to every sheet, e.g. {'phenolic': 2, 'ether': 1}. Overridden per-sheet if sheet_overrides is provided.

  • pore_diameter – Gap between sheet inner van-der-Waals surfaces, in Ångströms.

  • num_sheets – Number of parallel sheets (default 2 → one slit pore).

  • pore_type – "slit" (parallel stacked sheets) or "amorphous" (random rigid-body packing with steric rejection).

  • max_attempts – Max random placement attempts per sheet for pore_type="amorphous" before raising RuntimeError.

  • min_separation – Minimum inter-sheet atom-atom distance (Å) for pore_type="amorphous".

  • sheet_overrides – List of per-sheet config dicts (length must equal num_sheets). Accepted keys: target_num_carbons, H_C_ratio, O_C_ratio, functional_groups, aromaticity_percent, seed. If None, all sheets are chemically identical.

  • output_directory – Directory for output files.

  • basename – Base filename for .gro/.top/.itp files.

  • system_name – Name written to the [ system ] section in .top.

  • seed – Random seed for reproducibility.

Return type:

Tuple[list, Path, Path, list]

Returns:

(sheets, gro_path, top_path, itp_paths)

  • sheets – List[SheetResult] with mol, coords, composition.

  • gro_path, top_path – pathlib.Path objects.

  • itp_paths – list of pathlib.Path (one per unique sheet type).
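How sheet_overrides might combine with the shared arguments can be sketched without the library. The function name per_sheet_configs is hypothetical, and the merge rule (an override key replaces the shared value, other keys fall through) is an assumption based on the parameter description; the accepted keys and the length requirement are taken from the text above.

```python
from typing import Any, Dict, List, Optional

# Keys the documentation lists as accepted in each override dict.
ACCEPTED_KEYS = {"target_num_carbons", "H_C_ratio", "O_C_ratio",
                 "functional_groups", "aromaticity_percent", "seed"}


def per_sheet_configs(shared: Dict[str, Any], num_sheets: int,
                      sheet_overrides: Optional[List[Dict[str, Any]]] = None
                      ) -> List[Dict[str, Any]]:
    """Build one config dict per sheet (hypothetical helper)."""
    if sheet_overrides is None:
        # No overrides: every sheet is chemically identical.
        return [dict(shared) for _ in range(num_sheets)]
    if len(sheet_overrides) != num_sheets:
        raise ValueError("sheet_overrides must have one entry per sheet")
    configs = []
    for index, override in enumerate(sheet_overrides):
        unknown = set(override) - ACCEPTED_KEYS
        if unknown:
            raise ValueError(f"sheet {index}: unsupported keys {sorted(unknown)}")
        # Assumption: override values win, shared values fill the gaps.
        configs.append({**shared, **override})
    return configs


shared = {"target_num_carbons": 50, "H_C_ratio": 0.3,
          "O_C_ratio": 0.05, "functional_groups": None}
configs = per_sheet_configs(shared, 2, [
    {"functional_groups": {"phenolic": 3}, "target_num_carbons": 40},
    {"functional_groups": {"carboxyl": 2}, "target_num_carbons": 50},
])
for cfg in configs:
    print(cfg)
```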

Examples:

# Simple slit pore — two identical sheets, 10 Å pore
sheets, gro, top, itps = generate_surface(
    target_num_carbons=40,
    functional_groups={'phenolic': 2, 'ether': 1},
    pore_diameter=10.0,
)

# Asymmetric pore — different chemistry on each wall
sheets, gro, top, itps = generate_surface(
    pore_diameter=8.0,
    sheet_overrides=[
        {'functional_groups': {'phenolic': 3}, 'target_num_carbons': 40},
        {'functional_groups': {'carboxyl': 2}, 'target_num_carbons': 50},
    ],
)

biochar.biochar_generator.generate_biochar_series(configurations, output_directory='.', create_combined_top=True, verbose=True)[source]

Generate multiple biochar structures for mixed simulations.

This function is ideal for creating temperature series, composition series, or mixed biochar systems for GROMACS simulations.

Parameters:
  • configurations – List of configuration dictionaries. Each dict should contain:

    • 'molecule_name' (str, required): Residue name (max 5 chars, e.g. 'BC400')

    • 'target_num_carbons' (int, optional): Default 50

    • 'H_C_ratio' (float, optional): Default 0.5

    • 'O_C_ratio' (float, optional): Default 0.1

    • 'aromaticity_percent' (float, optional): Default 90.0

    • 'seed' (int, optional): For reproducibility

    • 'functional_groups' (dict, optional): e.g. {"phenolic": 3, "carboxyl": 1}

  • output_directory – Output directory for all files

  • create_combined_top – If True, generate a combined topology for all structures

  • verbose – If True, print progress information

Return type:

Dict[str, Tuple[Path, Path, Path]]

Returns:

Dictionary mapping molecule_name -> (gro_path, top_path, itp_path)

Example

>>> configs = [
...     {'molecule_name': 'BC400', 'H_C_ratio': 0.65, 'O_C_ratio': 0.20},
...     {'molecule_name': 'BC600', 'H_C_ratio': 0.55, 'O_C_ratio': 0.12},
...     {'molecule_name': 'BC800', 'H_C_ratio': 0.40, 'O_C_ratio': 0.05},
... ]
>>> results = generate_biochar_series(configs, output_directory='output')
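The way each configuration dict is completed with the documented defaults can be sketched as follows. This is a standalone illustration, not the library's code: normalize_configuration is a hypothetical name, and raising KeyError or ValueError for a missing or over-long molecule_name is an assumption about how such input would be rejected.

```python
from typing import Any, Dict

# Defaults stated in the configurations parameter description.
SERIES_DEFAULTS = {
    "target_num_carbons": 50,
    "H_C_ratio": 0.5,
    "O_C_ratio": 0.1,
    "aromaticity_percent": 90.0,
}


def normalize_configuration(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in documented defaults for one series entry (hypothetical helper)."""
    if "molecule_name" not in cfg:
        raise KeyError("'molecule_name' is required")
    name = cfg["molecule_name"]
    # Residue names are limited to 5 characters.
    if not isinstance(name, str) or not 1 <= len(name) <= 5:
        raise ValueError(f"molecule_name must be 1-5 characters, got {name!r}")
    # Explicit values in cfg take precedence over the defaults.
    return {**SERIES_DEFAULTS, **cfg}


normalized = normalize_configuration(
    {"molecule_name": "BC400", "H_C_ratio": 0.65, "O_C_ratio": 0.20}
)
print(normalized)
```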

Configuration

class biochar.biochar_generator.GeneratorConfig(target_num_carbons=50, size_tolerance=0.1, H_C_ratio=None, H_C_tolerance=0.1, O_C_ratio=None, O_C_tolerance=0.1, aromaticity_percent=None, aromaticity_tolerance=5.0, functional_groups=None, periodic_box=False, box_size=None, molecule_name='BC', seed=None, defect_fraction=0.0, max_ether_span=3, num_pyridinic=0, num_pyrrolic=0, num_graphitic=0, charge_method='opls', temperature=None, feedstock=None, strict=True)[source]

Bases: object

Configuration for BiocharGenerator.

All parameters are optional — defaults produce a 50-carbon, mostly-aromatic biochar with a moderate hydrogen content and no oxygen.

Parameters:
  • target_num_carbons (int)

  • size_tolerance (float)

  • H_C_ratio (float | None)

  • H_C_tolerance (float)

  • O_C_ratio (float | None)

  • O_C_tolerance (float)

  • aromaticity_percent (float | None)

  • aromaticity_tolerance (float)

  • functional_groups (Dict[str, int] | None)

  • periodic_box (bool)

  • box_size (ndarray | None)

  • molecule_name (str)

  • seed (int | None)

  • defect_fraction (float)

  • max_ether_span (int)

  • num_pyridinic (int)

  • num_pyrrolic (int)

  • num_graphitic (int)

  • charge_method (str)

  • temperature (float | None)

  • feedstock (str | None)

  • strict (bool)

target_num_carbons

Target number of carbon atoms in the skeleton. The generator grows the PAH graph until it reaches a count within size_tolerance of this value. Minimum 6 (benzene).

size_tolerance

Fractional tolerance on target_num_carbons (e.g. 0.10 = ±10 %).

H_C_ratio

Target hydrogen-to-carbon atomic ratio.

H_C_tolerance

Fractional tolerance on H_C_ratio.

O_C_ratio

Target oxygen-to-carbon atomic ratio. Ignored when functional_groups is not None.

O_C_tolerance

Fractional tolerance on O_C_ratio.

aromaticity_percent

Target fraction of carbon atoms that are aromatic, as a percentage (0–100).

aromaticity_tolerance

Absolute tolerance on aromaticity_percent in percentage-point units.
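The difference between the fractional tolerances (size_tolerance, H_C_tolerance, O_C_tolerance) and the absolute aromaticity_tolerance can be made concrete with a short sketch. The helper names are hypothetical, and reading "fractional" as a fraction of the target value is an assumption; the library's exact acceptance test is not shown in this reference.

```python
def within_fractional(actual: float, target: float, tolerance: float) -> bool:
    """True if actual lies within +/- tolerance * target of target."""
    return abs(actual - target) <= tolerance * abs(target)


def within_absolute(actual: float, target: float, tolerance: float) -> bool:
    """True if actual lies within +/- tolerance (same units) of target."""
    return abs(actual - target) <= tolerance


# size_tolerance=0.1 on 50 carbons accepts roughly 45 to 55 atoms.
size_ok = within_fractional(54, 50, 0.1)
size_bad = within_fractional(57, 50, 0.1)
# aromaticity_tolerance=5.0 is in percentage points, not a fraction.
arom_ok = within_absolute(86.0, 90.0, 5.0)
arom_bad = within_absolute(80.0, 90.0, 5.0)
print(size_ok, size_bad, arom_ok, arom_bad)
```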

functional_groups

Explicit dict mapping functional group name to exact placement count, e.g. {"phenolic": 3, "carboxyl": 1}. Supported groups: phenolic, hydroxyl, carboxyl, ether. If None, total oxygen is derived from O_C_ratio and placed as phenolic groups.

periodic_box

If True, include periodic boundary box vectors in the exported .gro file.

box_size

Explicit box size in nm as a 3-element array. Used only when periodic_box is True and box_size is not None.

molecule_name

Residue name written to .gro / .itp (max 5 characters — GROMACS hard limit). Suggested naming: BC400, BC600, BC800 (pyrolysis temperature series).

seed

Integer random seed for reproducibility. None = random.

defect_fraction

Probability [0, 1) that each ring added during skeleton growth is a 5-membered pentagon rather than a hexagon. 0.0 = pure hexagonal PAH. Values 0.05–0.20 introduce realistic topological disorder seen in low-temperature biochar.
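As an illustration of what a per-ring probability means, the sketch below draws a ring size for each growth step. It uses Python's standard random module and a hypothetical function name; the generator's actual random number source and growth algorithm are not described here, so treat this only as a model of the defect_fraction semantics.

```python
import random
from typing import List, Optional


def choose_ring_sizes(num_rings: int, defect_fraction: float,
                      seed: Optional[int] = None) -> List[int]:
    """Pick 5 (pentagon) with probability defect_fraction, else 6 (hexagon)."""
    if not 0.0 <= defect_fraction < 1.0:
        raise ValueError("defect_fraction must be in [0, 1)")
    rng = random.Random(seed)
    return [5 if rng.random() < defect_fraction else 6 for _ in range(num_rings)]


pure = choose_ring_sizes(20, 0.0, seed=1)        # hexagons only
disordered = choose_ring_sizes(1000, 0.2, seed=1)
print(set(pure), disordered.count(5) / len(disordered))
```

With the same seed the sequence is reproducible, which mirrors the role of the seed field in GeneratorConfig.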

max_ether_span

Maximum C–C shortest-path distance (in bonds) between the two ring carbons bridged by each ether oxygen. Controls the ring size of the C–O–C bridge (ring size = max_ether_span + 2). Default 3 → 5-membered furan-like ring (always geometrically flat). Larger values may fold the aromatic sheet.
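The ring-size relation stated above is simple arithmetic; a minimal sketch (the function name is hypothetical) tabulates it for the spans discussed in this reference.

```python
def ether_ring_size(span: int) -> int:
    """Ring closed by a C-O-C bridge: span C-C bonds plus two C-O bonds."""
    if span < 1:
        raise ValueError("span must be at least 1")
    return span + 2


ring_sizes = {span: ether_ring_size(span) for span in (3, 4, 5)}
print(ring_sizes)
```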

Class API

class biochar.biochar_generator.BiocharGenerator(config=None)[source]

Bases: object

Generate a single biochar molecule and export it to GROMACS files.

The generator runs a five-step pipeline:

  1. Carbon skeleton — grows a PAH graph to the requested carbon count using hexagonal ring expansion, optionally with pentagonal defects when defect_fraction > 0.

  2. Heteroatom assignment — places oxygen-containing functional groups then fills remaining valences with hydrogen.

  3. 3D coordinates — embeds the molecule in 3D; flattens large sheets via the hex-lattice path and optimises O–H hydrogen positions.

  4. OPLS-AA typing — assigns atom types and partial charges.

  5. Validation — checks composition ratios and geometry.

Use generate_biochar() for a one-call convenience wrapper.

Examples:

config = GeneratorConfig(target_num_carbons=80, H_C_ratio=0.4,
                         O_C_ratio=0.1, seed=42)
gen = BiocharGenerator(config)
mol, coords, composition = gen.generate()
gro, top, itp = gen.export_gromacs(output_directory="output")

Parameters:

config (GeneratorConfig | None)

generate()[source]

Run the full generation pipeline and return the molecular structure.

Return type:

Tuple[Mol, ndarray, CompositionResult]

Returns:

Tuple of

  • mol (rdkit.Chem.Mol) — molecule with 3-D conformer and OPLS-AA atom types assigned.

  • coords (numpy.ndarray, shape (N, 3)) — atomic coordinates in Ångströms.

  • composition (CompositionResult) — atom counts, H/C and O/C ratios, and functional-group census.

Raises:

RuntimeError – If carbon skeleton growth fails after retries.
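The ratios reported in the composition result are plain atom-count quotients. The sketch below shows that arithmetic on a list of element symbols; CompositionSketch and its field names are hypothetical stand-ins and are not the attributes of the library's composition object.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CompositionSketch:
    """Hypothetical stand-in holding atom counts for C, H and O."""
    num_C: int
    num_H: int
    num_O: int

    @property
    def H_C_ratio(self) -> float:
        return self.num_H / self.num_C

    @property
    def O_C_ratio(self) -> float:
        return self.num_O / self.num_C


def composition_from_symbols(symbols: Iterable[str]) -> CompositionSketch:
    """Count atoms by element symbol and wrap them in the sketch class."""
    counts = Counter(symbols)
    if counts["C"] == 0:
        raise ValueError("at least one carbon atom is required")
    return CompositionSketch(counts["C"], counts["H"], counts["O"])


# Phenol, C6H6O, as a minimal test molecule.
phenol = composition_from_symbols(["C"] * 6 + ["H"] * 6 + ["O"])
print(phenol.H_C_ratio, round(phenol.O_C_ratio, 3))
```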

export_gromacs(output_directory='.', basename='biochar')[source]

Write GROMACS structure and topology files.

Must be called after generate().

Parameters:
  • output_directory – Directory in which to write output files. Created if it does not exist.

  • basename – Stem for output filenames (e.g. "bc400" → bc400.gro, bc400.top, bc400.itp).

Return type:

Tuple[Path, Path, Path]

Returns:

Tuple of Path objects (gro_path, top_path, itp_path).

Raises:

RuntimeError – If generate() has not been called yet.

Parameters:
  • output_directory (str)

  • basename (str)
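The call-order requirement and the filename scheme of export_gromacs() can be mimicked in a few lines. ExportGuardSketch is a hypothetical class that writes no GROMACS content; it only shows a guard that raises RuntimeError before generate() has run, creation of the output directory, and paths derived from basename. The internal attribute and the error message are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


class ExportGuardSketch:
    """Hypothetical stand-in illustrating the generate-then-export order."""

    def __init__(self) -> None:
        self._molecule: Optional[object] = None

    def generate(self) -> None:
        # Placeholder for the real pipeline; just records that it ran.
        self._molecule = object()

    def export_paths(self, output_directory: str = ".",
                     basename: str = "biochar") -> Tuple[Path, Path, Path]:
        if self._molecule is None:
            raise RuntimeError("generate() must be called before exporting")
        out_dir = Path(output_directory)
        out_dir.mkdir(parents=True, exist_ok=True)  # created if missing
        return (out_dir / f"{basename}.gro",
                out_dir / f"{basename}.top",
                out_dir / f"{basename}.itp")


sketch = ExportGuardSketch()
try:
    sketch.export_paths()
    guard_triggered = False
except RuntimeError:
    guard_triggered = True

sketch.generate()
with tempfile.TemporaryDirectory() as tmp:
    gro, top, itp = sketch.export_paths(str(Path(tmp) / "output"), basename="bc400")
    created = gro.parent.is_dir()
print(guard_triggered, created, gro.name, top.name, itp.name)
```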

print_summary()[source]

Print summary of generated structure.