QAOA Tutorial: Cost Hamiltonian to Python Example

A practical QAOA tutorial that shows how to map Max-Cut to a cost Hamiltonian and build a working Python example with realistic planning guidance.

QAOA can feel abstract until you map the math to a circuit, a classical optimizer, and a concrete optimization problem. This guide gives you that bridge. You will learn what the Quantum Approximate Optimization Algorithm is doing, how to turn a small graph problem into a cost Hamiltonian, how to estimate whether a QAOA experiment is worth running on a simulator or hardware, and how to build a working Python example you can adapt later. The emphasis is practical: parameter choices, depth tradeoffs, measurement budget, and the small decisions that usually determine whether a first QAOA project teaches you something useful or just burns time.

Overview

QAOA, short for the Quantum Approximate Optimization Algorithm, is a variational algorithm designed for combinatorial optimization. In plain terms, you start with a problem such as Max-Cut, encode it into a quantum cost function, build a parameterized circuit, and let a classical optimizer search for angles that produce low-energy or high-value bitstrings.

For developers, the useful mental model is this:

The problem layer defines what counts as a good solution.
The quantum circuit prepares candidate solutions and reshapes their probabilities.
The classical optimizer updates circuit parameters based on measured results.

That loop is the heart of variational optimization in quantum computing. QAOA is often introduced through Max-Cut because the mapping is clean and the resulting circuits are easy to inspect.

A minimal QAOA workflow looks like this:

Pick an optimization problem.
Express the objective as a cost Hamiltonian.
Choose a circuit depth, usually written as p.
Build alternating cost and mixer layers.
Measure bitstrings many times.
Use a classical optimizer to update parameters.
Decode the best measured bitstring back into a problem solution.

This article uses Max-Cut as the running example because it is the fastest way to understand QAOA explained in code rather than only in notation.

If you are still choosing a software stack, our Qiskit vs Cirq vs PennyLane for Beginners guide can help you decide where to start. For this tutorial, Qiskit is a natural fit because it gives you straightforward circuit construction and simulator access for a clean qaoa python example.

How to estimate

Before you write code, it helps to estimate three things: circuit size, optimization effort, and measurement cost. This matters because QAOA performance depends as much on engineering choices as on the algorithm itself.

1. Estimate circuit width

For graph-based Max-Cut, the number of qubits is usually the number of vertices. A graph with 6 nodes needs 6 qubits. That sounds simple, but it sets the baseline for simulator memory, transpilation complexity, and hardware fit.

Rule of thumb: start with 4 to 10 qubits for learning, debugging, and comparing parameter settings.

2. Estimate circuit depth

QAOA depth is controlled by p, the number of alternating cost and mixer layers. Larger p can increase expressiveness, but it also increases two-qubit gate count and usually makes optimization harder.

A simple estimate for Max-Cut is:

depth ≈ p × (cost-layer gates + mixer-layer gates)

For each edge in the graph, the cost layer typically contributes a ZZ-style interaction. For each qubit, the mixer layer contributes an X-rotation. So a rough planning formula is:

two-qubit interactions per evaluation ≈ p × number_of_edges

single-qubit mixer rotations per evaluation ≈ p × number_of_vertices

This is not a hardware-native gate count, but it is good enough to compare problem instances before committing to a run.

3. Estimate optimization workload

The classical optimizer evaluates the circuit many times. Each evaluation uses one parameter set, runs the circuit repeatedly, and computes an objective estimate from the sampled measurements.

For standard QAOA with depth p, you usually optimize:

2p parameters = [γ1, β1, γ2, β2, ..., γp, βp]

The real cost is not just the number of parameters. It is roughly:

total quantum work ≈ optimizer_steps × shots_per_step × circuits_per_step

Depending on the optimizer, circuits_per_step may be one, several, or many. Gradient-free methods can be simple to use, but some need more evaluations. Gradient-based methods may converge faster in some settings, but their implementation details vary by SDK and backend.

4. Estimate whether hardware is worth trying

For a first pass, ask four questions:

How many qubits does the problem need?
How many entangling gates does one QAOA layer require?
How many shots will you need for stable objective estimates?
Will noise overwhelm improvements from raising p?

If you cannot answer those yet, stay on a simulator. That is usually the right move for a first qaoa tutorial workflow. Our quantum circuit simulator comparison is useful when you want to compare local and managed options.

Once you are ready for cloud platforms, see the platform-specific setup guides for IBM Quantum, AWS Braket, and Azure Quantum.

Inputs and assumptions

To keep the example concrete, we will assume a small unweighted Max-Cut instance. Max-Cut asks you to split graph vertices into two groups so that as many edges as possible cross the partition.

The optimization problem

Suppose the graph has vertices 0, 1, 2, 3 and edges:

[(0,1), (1,2), (2,3), (3,0), (0,2)]

This graph is small enough to reason about by hand and rich enough to show QAOA structure.

The cost Hamiltonian intuition

For Max-Cut, each edge contributes when its two endpoint bits are different. In the computational basis, a common encoding rewards bitstrings where connected nodes land on opposite sides of the cut.

You do not need the full derivation to implement it. For each edge (i,j), the usual QAOA cost term is based on a ZZ interaction. The total cost Hamiltonian is the sum of those edge terms.

What matters in practice is that:

Edges become pairwise interactions in the circuit.
The graph structure directly determines the cost layer.
More edges usually mean more entangling operations.

The mixer Hamiltonian intuition

The mixer is usually a sum of X operators, one per qubit. Its job is to move probability mass between candidate bitstrings so the algorithm can explore the search space.

In code, the mixer layer is often just an RX rotation applied to every qubit with parameter 2β.

Parameter assumptions

For a first run, make conservative assumptions:

Depth p: start with 1 or 2.
Shots: use a fixed shot count large enough to reduce obvious sampling noise.
Optimizer: begin with a derivative-free optimizer that is easy to debug.
Initialization: start from small random angles or simple seeded values.

Developers often lose time by increasing p too early. A clean p = 1 experiment that returns plausible cuts is more useful than a tangled p = 4 run with no diagnostic visibility.

A working Qiskit example

The code below is intentionally explicit. It avoids too much framework magic so you can see how the cost Hamiltonian becomes gates.

import numpy as np
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
from scipy.optimize import minimize

# Small Max-Cut graph
n_qubits = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
shots = 2048
simulator = AerSimulator()


def maxcut_value(bitstring, edges):
    bits = [int(b) for b in bitstring]
    value = 0
    for i, j in edges:
        if bits[i] != bits[j]:
            value += 1
    return value


def qaoa_circuit(gammas, betas, n_qubits, edges):
    qc = QuantumCircuit(n_qubits, n_qubits)

    # Initial uniform superposition
    for q in range(n_qubits):
        qc.h(q)

    p = len(gammas)
    for layer in range(p):
        gamma = gammas[layer]
        beta = betas[layer]

        # Cost layer for Max-Cut using ZZ interactions
        for i, j in edges:
            qc.cx(i, j)
            qc.rz(2 * gamma, j)
            qc.cx(i, j)

        # Mixer layer
        for q in range(n_qubits):
            qc.rx(2 * beta, q)

    qc.measure(range(n_qubits), range(n_qubits))
    return qc


def expected_cost(params, p, n_qubits, edges, shots):
    gammas = params[:p]
    betas = params[p:]
    qc = qaoa_circuit(gammas, betas, n_qubits, edges)
    result = simulator.run(qc, shots=shots).result()
    counts = result.get_counts()

    total = 0
    for bitstring, count in counts.items():
        # Reverse if your bit ordering convention requires it
        value = maxcut_value(bitstring[::-1], edges)
        total += value * count

    mean_value = total / shots

    # Minimize negative expected cut to maximize the cut
    return -mean_value


p = 1
initial_params = np.array([0.5, 0.5])  # [gamma_1, beta_1]

result = minimize(
    expected_cost,
    initial_params,
    args=(p, n_qubits, edges, shots),
    method='COBYLA'
)

best_params = result.x
best_gamma = best_params[:p]
best_beta = best_params[p:]

final_qc = qaoa_circuit(best_gamma, best_beta, n_qubits, edges)
final_result = simulator.run(final_qc, shots=shots).result()
final_counts = final_result.get_counts()

ranked = sorted(
    final_counts.items(),
    key=lambda item: item[1],
    reverse=True
)

print('Best parameters:', best_params)
print('Top bitstrings:')
for bitstring, count in ranked[:5]:
    print(bitstring, count, 'cut=', maxcut_value(bitstring[::-1], edges))

This example is intentionally minimal, but it already demonstrates the core logic:

Graph edges define the cost layer.
Alternating cost and mixer layers define the ansatz.
A classical optimizer searches for better angles.
Sampled bitstrings are scored using the original optimization objective.

If you are setting up the environment from scratch, our Qiskit installation guide covers a clean Python setup path.

Worked examples

Here is how to think about QAOA outcomes in a way that is useful for developers, not just for papers.

Example 1: A first-pass simulator run

Suppose you run the 4-node graph above with p = 1. You are not trying to prove quantum advantage. You are checking for three signs of life:

The optimizer changes parameters away from their initial values.
The high-frequency bitstrings have better-than-random cut values.
Repeated runs produce similar best cut values even if the exact angles differ.

If those conditions hold, the implementation is probably behaving sensibly.

What should you inspect?

Top measured bitstrings: Are they valid high-cut solutions?
Objective trace: Does the expected cut improve over time?
Parameter stability: Do multiple seeds lead to similar outcomes?

For small graphs, compare your result against a brute-force classical optimum. That gives you an immediate quality check.

Example 2: Raising depth from p = 1 to p = 2

Now estimate what changes when you double depth.

You double the number of variational parameters from 2 to 4.
You approximately double the number of cost-layer interactions.
You usually need more optimizer evaluations.
You may need more shots to distinguish real improvement from sampling noise.

This is the most common QAOA planning mistake: treating higher p as automatically better. On simulators, p = 2 may still be easy. On hardware, the extra entangling operations can erase gains.

A practical decision rule is:

increase p only after p-1 is stable across seeds and shot budgets

That is a better engineering discipline than just climbing depth because a notebook example did.

Example 3: Estimating experiment cost in effort rather than money

Even without vendor-specific pricing, you can estimate whether a QAOA experiment is lightweight or expensive in team time.

Consider this planning frame:

Implementation effort: graph encoding, circuit checks, result decoding.
Tuning effort: optimizer, initialization, shot count, depth.
Validation effort: classical baseline, repeated seeds, result interpretation.

For a small Max-Cut tutorial, implementation is usually the easy part. Tuning and validation take longer. That is why a reusable qaoa python example should include diagnostics, not just a final circuit.

Example 4: Comparing simulator and hardware expectations

On a simulator, the QAOA loop is mostly limited by runtime. On hardware, you add queueing, calibration drift, topology constraints, and noise sensitivity. The same graph instance may require different circuit layouts or reduced depth.

A simple readiness checklist:

Can you transpile the circuit without extreme gate expansion?
Does the hardware connectivity fit your graph interactions reasonably well?
Are your simulator results robust enough that noise will not wipe out the signal?

If not, stay in simulation and refine the problem size or ansatz design first.

When to recalculate

QAOA is exactly the kind of topic worth revisiting because the useful answer changes when your inputs change. Recalculate your plan whenever one of these moves:

1. Your problem graph changes

More vertices change qubit count. More edges change entangling load. Weighted edges can change the optimization landscape. Any meaningful graph change should trigger a new estimate.

2. Your target backend changes

A local simulator, managed simulator, and real hardware backend are different execution environments. If you switch platforms, revisit shot count, transpilation effects, and practical depth limits. The backend is not just a deployment detail; it changes the experiment design.

3. Optimizer behavior changes

If you swap COBYLA for another optimizer, your expected circuit evaluations may change significantly. Re-estimate total work before assuming the new choice is cheaper or better.

4. You increase p

This is the biggest update trigger. More depth means more parameters, more gate operations, and usually more optimization effort. Treat every increase in p as a new experiment, not a small tweak.

5. Your validation standard gets stricter

A toy demo only needs plausibility. A production-facing benchmark needs repeated seeds, classical baselines, and cleaner result summaries. As your standard rises, your measurement and analysis budget rises too.

Practical next steps

If you want a repeatable workflow, use this sequence:

Start with a 4- to 8-node Max-Cut graph.
Run p = 1 on a simulator.
Compare with a brute-force classical optimum.
Log objective history, best bitstrings, and parameter seeds.
Only then test p = 2.
Only after that consider hardware execution.

That progression keeps the theory barrier low and gives you a reusable process for future graphs and future SDK updates.

For broader context on what current platforms can and cannot support reliably, our Qubit Reality Check is a helpful companion read. And if you are tracking where practical tooling is maturing, the quantum company trends overview gives a useful market lens.

The main takeaway is simple: QAOA is best approached as an engineering loop, not just a formula. Estimate the qubits, estimate the depth, estimate the optimization workload, and keep the problem small enough that you can verify every step. If you do that, QAOA becomes much less mysterious and much more useful as a learning path into variational quantum algorithms.

QAOA Tutorial: From Cost Hamiltonian to a Working Python Example

Overview

How to estimate

1. Estimate circuit width

2. Estimate circuit depth

3. Estimate optimization workload

4. Estimate whether hardware is worth trying

Inputs and assumptions

The optimization problem

The cost Hamiltonian intuition

The mixer Hamiltonian intuition

Parameter assumptions

A working Qiskit example

Worked examples

Example 1: A first-pass simulator run

Example 2: Raising depth from p = 1 to p = 2

Example 3: Estimating experiment cost in effort rather than money

Example 4: Comparing simulator and hardware expectations

When to recalculate

1. Your problem graph changes

2. Your target backend changes

3. Optimizer behavior changes

4. You increase p

5. Your validation standard gets stricter

Practical next steps

Related Topics

Sharp Qubit Labs Editorial

Up Next

AWS Braket Tutorial: How to Run Your First Quantum Job and Compare Simulators

Cirq Tutorial: How to Build and Simulate Quantum Circuits in Python

Qiskit Tutorial for Beginners: A Step-by-Step Python Path From Circuits to Results