-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conjugate reduction in optimize annotated pass #11811
Conversation
One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 8927040115Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Update: this PR extends the When the new optimization is applicable, it can give significant reduction, however the quantum circuit has to be of a somewhat special form for that to happen (see the second form of the controlled-adder example from the PR description). I do believe that this is a useful reduction, however this needs to be followed up by (1) constructing various standard library circuits using modifiers (annotated operations), (2) extending the applicability of the pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like this as a pass, thanks for it! I left a couple of comments on legibility of the algorithm, and the main thing I might like to see is a couple more tests of cases where conjugate reduction would have a harder time, such as not being able to reduce the whole block or whether there's disjoint sets of qubits.
qiskit/dagcircuit/dagcircuit.py
Outdated
def op_successors(self, node): | ||
"""Get the list of "op" successors of a node in the dag.""" | ||
return [pred for pred in self.successors(node) if isinstance(pred, DAGOpNode)] | ||
|
||
def op_predecessors(self, node): | ||
"""Get the list of "op" predecessors of a node in the dag.""" | ||
return [pred for pred in self.predecessors(node) if isinstance(pred, DAGOpNode)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the DAGCircuit
methods like these return an iterator (such as the generator expression (pred for pred in ...)
) instead of a list
- it might be better to do that here too, if not only for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in a31b9e6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I guess we ought to have a release note (and ideally test) for the new DAGCircuit
methods. We can do that in a follow-up if there's not enough time right now.
@@ -51,6 +63,7 @@ def __init__( | |||
equivalence_library: Optional[EquivalenceLibrary] = None, | |||
basis_gates: Optional[List[str]] = None, | |||
recurse: bool = True, | |||
do_conjugate_reduction: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nitpicking: conjugate_reduction
is a little shorter and communicates (imo) the same information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason to have the longer name (I think) was to more clearly differentiate between the option of whether to apply conjugate reduction (currently called do_conjugate_reduction
) and the actual method that does this (currently called _conjugate_reduction
). The reason is that conjugate reduction is an option is because it's technically a more expensive optimization and we may want to have a way to switch it off in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not a big deal. The internal private attribute could still be called _do_conjugate_reduction
if you were worried about naming collisions while still permitting OptimizeAnnotated(conjugate_reduction=False)
, but it's fine as-is.
We return the operation ``op_new`` which a new custom instruction with definition | ||
``IP * A * IR``, where ``A`` is a new annotated-operation with modifiers ``M`` and | ||
base gate ``IQ``. | ||
""" | ||
p_dag, q_dag, r_dag = base_decomposition | ||
|
||
p_instr = Instruction( | ||
name="ip", num_qubits=op.base_op.num_qubits, num_clbits=op.base_op.num_clbits, params=[] | ||
) | ||
p_instr.definition = dag_to_circuit(p_dag) | ||
|
||
q_instr = Instruction( | ||
name="iq", num_qubits=op.base_op.num_qubits, num_clbits=op.base_op.num_clbits, params=[] | ||
) | ||
q_instr.definition = dag_to_circuit(q_dag) | ||
|
||
r_instr = Instruction( | ||
name="ir", num_qubits=op.base_op.num_qubits, num_clbits=op.base_op.num_clbits, params=[] | ||
) | ||
r_instr.definition = dag_to_circuit(r_dag) | ||
|
||
op_new = Instruction( | ||
"optimized", num_qubits=op.num_qubits, num_clbits=op.num_clbits, params=[] | ||
) | ||
num_control_qubits = op.num_qubits - op.base_op.num_qubits | ||
|
||
circ = QuantumCircuit(op.num_qubits, op.num_clbits) | ||
circ.append(p_instr, range(num_control_qubits, op.num_qubits)) | ||
circ.append( | ||
AnnotatedOperation(base_op=q_instr, modifiers=op.modifiers), range(op.num_qubits) | ||
) | ||
circ.append(r_instr, range(num_control_qubits, op.num_qubits)) | ||
op_new.definition = circ | ||
return op_new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we use this nested ip
/iq
/ir
structure to do anything later? If not: perhaps setting the definition of new_op
to be the composition ip . Annotated(iq) . ir
and removing a level of nesting might be easier and involve fewer magic instruction names. If we do (always) use the structure, then perhaps we could formalise it into a ConjugateReduction(Instruction)
special-case class that has explicit fields for the components that we just store directly as DAGs? It might be easier to retrieve them for later inspection, if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great suggestion - done in 8e5858d. I also think that the flatter structure is better, but somehow forgot about compose
at the time when initially implementing this PR.
# optimization to only consider qubits involved in nodes at the front or at the back of the | ||
# circuit | ||
unchecked_qubits = set() | ||
|
||
# optimization to keep pairs of nodes for which the inverse check was performed and the nodes | ||
# were found to be not inverse to each other | ||
checked_node_pairs = set() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I understand which qubits go into unchecked_qubits
here - maybe it's just the name of the variable throwing me off.
I think we put all qubits that have a zero-predecessor and/or zero-successor gate (accounting for edges we already popped off) in here, because those are the only ones that could have an inverse pair bracketing the circuit. Is that right?
If so: perhaps a comment explaining that part of the algorithm might be a bit clearer than calling it an "optimisation". Referring to it as an "optimisation" implies that we're comparing to some generally known algorithm, but this is the first time that I'm thinking about any algorithm for this pass, so the comment isn't super clear to me. Something like outside_node_qubits
might be a little more descriptive for the variable name as well - I did a double-take seeing to_check = unchecked_qubits.copy()
, though I can see the logic.
Fwiw: checked_node_pairs
is implementing an optimisation called memoisation. We don't necessarily need to call it that by name, I'm just mentioning its name in case you think it's useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am renaming unchecked_qubits
to active_qubits
and improving the comments. At each "iteration" of looking for an inverse pair of gates, one from the start and one from the back of the circuit, we want to iterate (only) over the currently active qubits, as typically the number of such qubits is significantly smaller than the total number of qubits.
Memoisation - I have definitely heard the term before, but I can never remember the correct terminology.
# For each qubit, keep the nodes at the start or at the back of the circuit involving this qubit | ||
# (also allowing None and empty values). | ||
q_to_f = {} | ||
q_to_b = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling these front_node_for_qubit
and back_node_for_qubit
might be a shade easier to read than the abbreviations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks - doing this!
# in the future we want to include a more precise check whether a pair | ||
# of nodes are inverse | ||
if fn.op == bn.op.inverse(): | ||
# update q_to_f, q_to_b maps | ||
for q in fn.qargs: | ||
q_to_f[q] = None | ||
for q in bn.qargs: | ||
q_to_b[q] = None | ||
|
||
# see which other nodes become at the front and update information | ||
for node in dag.op_successors(fn): | ||
if node not in processed_nodes: | ||
in_degree[node] -= 1 | ||
if in_degree[node] == 0: | ||
for q in node.qargs: | ||
q_to_f[q] = node | ||
unchecked_qubits.add(q) | ||
|
||
# see which other nodes become at the back and update information | ||
for node in dag.op_predecessors(bn): | ||
if node not in processed_nodes: | ||
out_degree[node] -= 1 | ||
if out_degree[node] == 0: | ||
for q in node.qargs: | ||
q_to_b[q] = node | ||
unchecked_qubits.add(q) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't discard the other qargs
in an inverse pair out of to_check
, which might cause interesting edge-case interactions if we have a pair A(0, 1) . B(0, 1) . <...> . B(0, 1) . A(0, 1)
or something - we might strip off the A(0, 1)
pair while examining 0
, but then also strip off the B
pair while examining 1
. I'm not clear that that would actually cause any problems (what if A and B shared some but not all qargs?), but at a minimum it would be good to get some tests of that in.
Fwiw, if we wanted to sidestep it, then having the for q in to_check
loop be something like:
while to_check:
qubit = to_check.pop()
if fn == bn.inverse():
for q in fn.qargs:
to_check.discard(q)
could get around that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this can happen, but should be fine since the nodes get output in the correct order, so in your example we will add A(0, 1)
first and B(0, 1)
second. In fact, we probably even want this to happen to reduce the total number of main iterations (though probably it does not really matter). I will add more tests.
if (fn := q_to_f.get(q, None)) is None: | ||
continue | ||
if (bn := q_to_b.get(q, None)) is None: | ||
continue | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit weird to have fn
here but front_circuit
later. Might be good to call it the full front_node
(if not only because then I don't have to double-take about fn
being the keyword for functions in Rust lol).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renaming (and yes I am now also getting confusing thinking this is a function)
|
||
for node in back_block: | ||
back_circuit.apply_operation_back(node.op, node.qargs, node.cargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly it's a bit more efficient to do for node in reversed(back_block)
than back_block = back_block[::1]; for node in back_block
, but that said - we could also just use DAGCircuit.apply_operation_front
.
(Fwiw, calling reversed
on a Python list to make an iterator will produce an iterator that's just as fast as the regular forwards one.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, switching to DAGCircuit.apply_operation_front
# THe pass should simplify the gate | ||
qc_optimized_keys = qc_optimized.count_ops().keys() | ||
self.assertIn("optimized", qc_optimized_keys) | ||
self.assertNotIn("annotated", qc_optimized_keys) | ||
self.assertEqual(Operator(qc), Operator(qc_optimized)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to make some assertions that the pass has collected the gates into the right parts of the decomposition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have added multiple tests exploring various collection cases.
… rather than lists
Jake, many thanks for the review. I have addressed your comments and added more tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really cool optimisation, thanks! I'm feeling ok to merge this and let people start playing with it. Thanks for the fast changes!
* initial commit for the conjugate reduction optimization * do not go into definitions when unnecessary * release notes * typo * minor * rewriting as list comprehension * improving release notes * removing print statement * changing op_predecessors and op_successor methods to return iterators rather than lists * and removing explcit iter * constructing the optimized circuit using compose rather than append * improving variable names * adding test * adding tests exploring which gates get collected * more renaming --------- Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Summary
Close #11810.
This PR adds a new optimization to the
OptimizeAnnotated
transpiler pass. The reduction looks for annotated operations (objects of typeAnnotatedOperation
that consist of a base operationB
and a listM
of control, inverse and power modifiers) with the following properties:B
can be expressed asP -- Q -- R
withR = P^{-1}
.For such annotated operations, we can move the modifiers to be over the
Q
-part only, i.e. replaceM - [P -- Q -- R]
byP -- M-[Q] -- R
.Example: Controlled Draper QFT Adder
This has been one of the main motivating examples. Consider the
controlled_adder
circuit constructed below:The synthesized adder has the following operations:
[('cx', 746), ('p', 630), ('cp', 237), ('h', 84), ('ry', 16), ('ccx', 16)]
. (Note thatHighLevelSynthesis
is the transpiler pass that withintranspile
that decomposes the circuit into 1- and 2-qubit gates).On the other hand, the controlled adder is of the form
control - [QFT -- U -- QFT^{-1}]
, which can be simplified toQFT -- control-[U] -- IQFT
. By removing the controls overQFT`` and ``IQFT
parts of the circuit, one obtains significantly fewer gates in the transpiled circuit. The following is still not 100% perfect but showcases this reduction:Now the synthesized adder has the following operations:
[('cx', 306), ('p', 270), ('cp', 93), ('h', 44)]
.Example: CSWAP gate
CSwapGate
inQiskit
has the following definition:We get exactly the same decomposition for
SwapGate().control(1, annotated=True
) by writing Swap as 3 CXs and moving the control to the middle CX (as the two outer CXs are inverse of each other).Details and comments
There are many interesting discussion points related to all this.
There is still a question of how the
OptimizeAnnotated
pass should be integrating into the transpile flow; I am now exploring several options.The key part of the PR is the code (currently in
optimize_annotated.py
) that matches nodes at the front of aDAGCircuit
with nodes at the back of theDAGCircuit
; currently the check is simplyfn.op == bn.op.inverse()
; inverse nodes are collected intoP
andR
respectively. Would something like this be uesful in other contexts?We want to have a more general function that checks whether two DAG nodes
A
andB
are inverse of each other. Currently I am checking thatA.qargs = B.qargs
.A.cargs = B.cargs
andA.op = B.op.inverse()
. What is the most general check we could do? For instance, we could relax to check toset(A.qargs) = set(B.qargs)
but that would also require checking inverse up to a specific permutation of qubits, and that seems hard. In addition, the checkA.op = B.op.inverse()
is very constrained: it does detect thatS
is the inverse ofSdg
or thatCX
is the (self-) inverse ofCX
, but it does not detect thatQFT
is the inverse ofIQFT
in the Draper adder example, because indraper_qft_adder.py
the inverse-QFT is defined asQFT(num_qubits_qft, do_swaps=False).inverse().to_gate()
and notQFT(num_qubits_qft, do_swaps=False).to_gate().inverse()
. This is one of the reasons that we needed to decompose the circuit 3 times before the optimization became applicable. It would also currently not detect thatSomeGate.inverse(annotated=True)
is the inverse ofSomeGate
.Yet another problem with
DraperQFTAdder(num_state_qubits=num_state_qubits, kind="half").control(num_controls, annotated=True)
is that the definition of the base operation (aka ofDraperQFTAdder
) is not yet of the formQFT -- U -- IQFT
, but a circuit consisting of only a single gate, whose definition in turn is of the form above. So again the optimization does apply without decomposing the circuit a few times.Also please take a look at how I have implemented the circuit
P - M-[Q] - R
with modifiersM
moved to the middle part. It's quite ugly, and the question is whether this can be simplified? (It works, and as we saw theHighLevelSynthesis
pass works correctly on such circuits).