Bulk Hetero Junction Range Expander

This collection of modules allows the expansion of site-energies (HOMO, LUMO) for 2-component bulk-hetero junctions. The energies can then be used as offsets to EA-IP values in LightForge.

The documentation is split into two sections: training the model that performs the actual expansion, and predictions carried out with the trained model. While training requires only a suitably crafted Deposit morphology as input, prediction requires several preparatory steps to obtain a suitable CG morphology.

Training of the Model

The training of the model is a self-contained uncharged_equilibration QuantumPatch calculation. Training can be carried out on several different morphologies independently and at the same time, because the histograms are only generated when the prediction is executed.

QuantumPatch range expansion

Start by generating an empty settings file by invoking


Then set up QuantumPatch as you would for a normal uncharged_equilibration QP calculation, i.e. set the System.Core settings, the number of iterations and the DFT engines including fallbacks. Take extra care with System.Core: molecules in the system core should not be exposed to finite-size effects at the edge of the morphology (the charge cloud radius is usually set to 60 Angström). If you reduce the Partial Charge Cutoff / Environment Radius to 45 Angström, you can also simulate a larger inner core. See the following image for an illustration:

If this is not respected, the observable vectors used for the prediction will include "vacuum" and therefore be inaccurate, which will always ruin your prediction.

The QuantumPatch settings now contain the MachineLearning block:

  enable: true
  neighbour_radius: 20.0
  histogram_width: 6
  debug: false

If you set MachineLearning.enable = true in a normal uncharged_equilibration calculation, QP will generate files in the MLAnalysis folder once the calculation has finished. You also need to set save_multipoles: true in the DFT engines used in the final step of the calculation (you can also leave it enabled in every step; the overhead is very low). histogram_width is the size of the histograms of the first nearest neighbours feature, and neighbour_radius is the radius of the shell in which neighbours are searched for. An exemplary DFT engine input would be:

DFTBplus 1:
  fallback: Turbomole 1
  engine: DFTBplus
  threads: 1
  charge_model: mulliken
  skfset: 3ob-3-1
  save_multipoles: true
Turbomole 1:
  fallback: Turbomole 1 Fallback
  engine: Turbomole
  basis: STO-3G
  functional: BP86
  threads: 1
  memory: 1500
  charge_model: ESP
  save_multipoles: true

If a fallback is configured, make sure it is also set up to save multipoles. Start the QP calculation as usual.
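The fallback check can be automated before a long run is started. The following is a minimal sketch, assuming the DFT engine settings have already been loaded into a plain Python dict; the function name engines_missing_multipoles is hypothetical and not part of QuantumPatch.

```python
def engines_missing_multipoles(engines, final_engine):
    """Walk the fallback chain of final_engine and collect every engine
    that is undefined or does not have save_multipoles enabled."""
    missing = []
    seen = set()
    name = final_engine
    while name is not None and name not in seen:
        seen.add(name)
        settings = engines.get(name)
        if settings is None:
            # The fallback is referenced but not defined in the settings.
            missing.append(name)
            break
        if not settings.get("save_multipoles", False):
            missing.append(name)
        name = settings.get("fallback")
    return missing


# Engine settings as in the example above, written as a dict
# (assumption: loaded from the settings file by whatever means you prefer).
engines = {
    "DFTBplus 1": {
        "fallback": "Turbomole 1",
        "engine": "DFTBplus",
        "save_multipoles": True,
    },
    "Turbomole 1": {
        "fallback": "Turbomole 1 Fallback",
        "engine": "Turbomole",
        "save_multipoles": True,
    },
}

print(engines_missing_multipoles(engines, "DFTBplus 1"))
```

With only the two engines shown above, the check reports "Turbomole 1 Fallback", because that fallback is referenced but its block is not part of the excerpt.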

QP Output files

Check the folder MLAnalysis for all important plots regarding machine learning.

The two most important files in this folder are called machine_learning_MOLTYPE.txt, one for each molecule type. Each contains the raw data of homo, lumo and observables and is required to predict site energies for the CG morphology. If your initial morphology has a distinguished z-axis, i.e. if your moltype gradient points along the z-axis, the mean_.png and sigma_.png files will contain a meaningful error estimate for the sigma in z-slices. If your z-axis is not distinguished (completely disordered BHJ), the errors are not meaningful, because the histogram within a z-slice contains a subsystem with its own concentration gradient. In this case the estimated errors will be higher than the actual errors.
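To make the z-slice statistics concrete, here is a minimal sketch of how a mean and a sigma per z-slice can be computed from site energies. It is an illustration only and not the QuantumPatch implementation: the function name slice_statistics, the slice thickness and the sample numbers are assumptions, and the data is passed as plain lists because the column layout of machine_learning_MOLTYPE.txt is not described here.

```python
import statistics
from collections import defaultdict


def slice_statistics(z_coords, energies, slice_thickness):
    """Group energies into slices along z and return
    {slice_index: (mean, sigma)} for every populated slice."""
    slices = defaultdict(list)
    for z, energy in zip(z_coords, energies):
        slices[int(z // slice_thickness)].append(energy)
    result = {}
    for index in sorted(slices):
        values = slices[index]
        # Population standard deviation; a slice with one value gives 0.0.
        result[index] = (statistics.fmean(values), statistics.pstdev(values))
    return result


# Made-up sample data: z in Angström, energies in eV.
z_coords = [1.0, 2.0, 11.0, 12.0]
energies = [-5.0, -5.2, -5.1, -5.1]
for index, (mean, sigma) in slice_statistics(z_coords, energies, 10.0).items():
    print(index, mean, sigma)
```

In a completely disordered BHJ each slice mixes both phases, so a per-slice sigma computed this way reflects the local concentration gradient as well as the energetic disorder.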

The other plots do not have to be reviewed. They show the histograms for each binning observable and are only of interest if the training fails. Once you have run QP calculations for all systems of interest, collect all machine_learning_MOLTYPE.txt files and make one concatenated file for each unique MOLTYPE.
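Collecting and concatenating the files can be scripted. The sketch below assumes that every training run keeps its files under RUN_DIR/MLAnalysis/, that the files are plain row-wise text without a header line, and that files of the same molecule type carry the same file name in every run. The function name concatenate_by_moltype is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def concatenate_by_moltype(run_dirs, out_dir):
    """Concatenate machine_learning_*.txt files with the same name from
    several run directories into one file per name inside out_dir."""
    groups = defaultdict(list)
    for run_dir in run_dirs:
        pattern = "MLAnalysis/machine_learning_*.txt"
        for path in sorted(Path(run_dir).glob(pattern)):
            groups[path.name].append(path)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for name, paths in groups.items():
        target = out_dir / name
        with target.open("w") as out:
            for path in paths:
                text = path.read_text()
                out.write(text)
                # Keep rows of consecutive files on separate lines.
                if text and not text.endswith("\n"):
                    out.write("\n")
        written[name] = target
    return written
```

A call such as concatenate_by_moltype(["stack_run", "interface_run"], "training_data") (directory names are placeholders) would leave one concatenated file per molecule type in training_data.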

Example calculation

To verify the training process, an example training containing the QuantumPatch results of the previous steps is found here. If you want to rerun all of the QP steps, delete quantumpatch_runtime_files before starting QuantumPatch. In this case you also require dummy parameters for Boron, which are found here. These parameters are not realistic and lead to overestimated disorder. We only recommend using them to test the functionality of this module. The systems were chosen because they combine a very large disorder (BDIP) with a vanishing disorder (C60), making this a hard training scenario.

Example calculation outputs

For the example calculation, both in the interface and in the stack case, disorders for C60 and BDIP are roughly in the vicinity of 15%, which is expected to be smaller than the statistical error of the real disorder in the z-slices. Exemplary output for the stack disorders is shown here:


Required inputs:

  • Multiple Deposit Morphologies
    • Homogeneous morphologies for each of the two phases
    • Heterogeneous morphology (or morphologies) for the mixed phase
  • CGkMC / CGMC morphology to be expanded

Preparing the IBI morphology

To prepare the input files for the coming steps, use the tool MolecularTools/PrepareIBI.py in the QuantumPatch folder. For each of the homogeneous and heterogeneous all-atom morphologies, run:

MolecularTools/PrepareIBI.py structurePBC.cml

The first time this script is executed it will generate multiple files:

  • cog_structurePBC.dat This file contains a center of geometry structure to be fed into IBI.
  • lf_cog_structurePBC.dat The same file as before with another column for the molecule type
  • mapping_structurePBC.dat A file required for future Range Expander calculations. It will be referenced in the QP RangeExpander documentation. The format of this file is:
      0: "7316113117c9506276dcc9c1d71ee076"
      1: "0ea4d81ac970d3f4fdbbe46acd91a041"
  • moltypes_structurePBC.dat List of molecule types in the same order as the molecules in cog_structurePBC.dat
  • orientationanalysisspec_structurePBC.dat Specification for an orientation analysis
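Because the ids in the mapping file determine the order in which the machine_learning_MOLTYPE.txt files have to be given later, it can help to read the mapping programmatically. This is a minimal sketch, assuming one id: "hash" entry per line as in the format shown for mapping_structurePBC.dat; parse_mapping is a hypothetical helper, not a QuantumPatch function.

```python
import re

# One entry per line: an integer id, a colon, and a quoted hex string.
_ENTRY = re.compile(r'^\s*(\d+)\s*:\s*"([0-9a-fA-F]+)"\s*$')


def parse_mapping(text):
    """Return {id: molecule type hash} for the given mapping file content."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognised mapping line: {line!r}")
        mapping[int(match.group(1))] = match.group(2)
    return mapping


example = (
    '0: "7316113117c9506276dcc9c1d71ee076"\n'
    '1: "0ea4d81ac970d3f4fdbbe46acd91a041"\n'
)
mapping = parse_mapping(example)
for moltype_id in sorted(mapping):
    print(moltype_id, mapping[moltype_id])
```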

Additionally, it prints output that has to be fed into IBI manually:

Structure data for IBI
density in particles/nm3: 1.490164
dimensions:  [[23.99720166666667, 0.0, 0.0], [0.0, 23.992671768707478, 0.0], [0.0, 0.0, 20.979702414965985]]
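Since these values are transferred to IBI by hand, a quick plausibility check is useful. The sketch below parses the printed lines and multiplies the density by the box volume to obtain the particle count. It assumes that the dimensions are given in nm (the output only states the unit of the density) and that the rows of dimensions are the box vectors; parse_ibi_output and box_volume are hypothetical helper names.

```python
import ast


def parse_ibi_output(text):
    """Extract the density and the 3x3 dimensions matrix from the
    text printed by PrepareIBI.py."""
    density = None
    dimensions = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # lines without a colon carry no value
        key = key.strip()
        if key.startswith("density"):
            density = float(value)
        elif key == "dimensions":
            dimensions = ast.literal_eval(value.strip())
    return density, dimensions


def box_volume(cell):
    """Volume of the box spanned by three vectors (3x3 determinant)."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = cell
    det = (ax * (by * cz - bz * cy)
           - ay * (bx * cz - bz * cx)
           + az * (bx * cy - by * cx))
    return abs(det)


output = """Structure data for IBI
density in particles/nm3: 1.490164
dimensions:  [[23.99720166666667, 0.0, 0.0], [0.0, 23.992671768707478, 0.0], [0.0, 0.0, 20.979702414965985]]
"""
density, dimensions = parse_ibi_output(output)
n_particles = density * box_volume(dimensions)
print(n_particles)
```

Under the nm assumption the product comes out very close to an integer particle count, which suggests the two values were copied consistently.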

For molecules without internal orientation (C60) these outputs are already sufficient to be fed into IBI. For molecules with internal orientation PrepareIBI has to be rerun:

  • Edit orientationanalysisspec_structurePBC.dat and enter two distinct atom ids for each molecule species. The orientation of the molecule will be sampled with respect to the axis defined by these two atoms. For Pentacene, for example, you can choose the long axis of the molecule. If only one of the two molecule species has an internal orientation, leave the other entry unchanged (i.e. both atom ids at -1).
  • Rerun MolecularTools/PrepareIBI.py structurePBC.cml to generate the additional file containing the orientation vector of each molecule. For molecule species for which no orientation was specified, a 0-vector will be included.
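The two steps above can be pictured with a short sketch of how an orientation vector follows from two atom ids. It is an assumption-laden illustration, not the PrepareIBI code: the function name orientation_vector is hypothetical, and whether PrepareIBI normalises the vector is not stated here (this sketch does).

```python
import math


def orientation_vector(positions, id_a, id_b):
    """Unit vector pointing from atom id_a to atom id_b.
    Returns the 0-vector if no axis was specified (ids of -1)."""
    if id_a < 0 or id_b < 0:
        return (0.0, 0.0, 0.0)
    if id_a == id_b:
        raise ValueError("the two atom ids must be distinct")
    start = positions[id_a]
    end = positions[id_b]
    diff = tuple(e - s for s, e in zip(start, end))
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("the two atoms are at the same position")
    return tuple(d / norm for d in diff)


# Made-up coordinates: atom id -> (x, y, z) in Angström.
positions = {0: (0.0, 0.0, 0.0), 5: (0.0, 0.0, 2.0)}
print(orientation_vector(positions, 0, 5))
print(orientation_vector(positions, -1, -1))
```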


Run QuantumPatch/MLTools/PredictCG.py without any arguments. A new settings file will be generated. Inside this settings file you have to specify the IBI output morphology, the mapping_file and the two machine_learning_MOLTYPE.txt files in the order of the ids in the mapping_file. Additionally, you can customize the box cutout, which is set to 30 Angström by default. It is important for the prediction to be free of edge effects, because the observable feature vector would be completely wrong at the edge.
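The effect of the box cutout can be sketched as follows. This assumes that the cutout is a margin trimmed from every face of an orthorhombic box whose origin is at 0, with all lengths in Angström; how PredictCG.py applies the cutout internally is not documented here, and cut_box is a hypothetical name.

```python
def cut_box(positions, box_lengths, cutout=30.0):
    """Keep only positions that are at least `cutout` away from
    every face of an orthorhombic box spanning 0..length per axis."""
    kept = []
    for pos in positions:
        inside = all(
            cutout <= coord <= length - cutout
            for coord, length in zip(pos, box_lengths)
        )
        if inside:
            kept.append(pos)
    return kept


# Made-up example: a 100 x 100 x 100 Angström box.
positions = [
    (50.0, 50.0, 50.0),  # well inside, kept
    (10.0, 50.0, 50.0),  # too close to the x = 0 face, dropped
    (50.0, 50.0, 95.0),  # too close to the z = 100 face, dropped
]
print(cut_box(positions, (100.0, 100.0, 100.0)))
```

A particle closer to a face than the neighbour shell radius would see part of its shell as empty space, which is the edge effect the cutout is meant to avoid.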

After running MLTools/PredictCG.py again with the correct settings file, two new files will be output:

  • output_cog.xyz contains the cut morphology
  • output_homolumo.dat contains predicted homo, lumo and gap to be fed into LightForge.

An exemplary run is found in this zip file.


The LightForge documentation is continued here
