Commit

(feat) Started rewriting the coding patterns chapter
d-krupke committed Oct 4, 2024
1 parent ecdf68a commit 0c5b4b2
Showing 2 changed files with 242 additions and 44 deletions.
143 changes: 121 additions & 22 deletions 06_coding_patterns.md
@@ -12,28 +12,66 @@

<!-- STOP_SKIP_FOR_README -->

In this section, we will explore various coding patterns that are essential for
structuring implementations for optimization problems using CP-SAT. While we
will not delve into the modeling of specific problems, our focus will be on
demonstrating how to organize your code to enhance its readability and
maintainability. These practices are crucial for developing robust and scalable
optimization solutions that can be easily understood, modified, and extended by
other developers. We will concentrate on basic patterns, as more complex
patterns are better understood within the context of larger problems and are
beyond the scope of this primer.

> [!WARNING]
>
> The naming conventions for patterns in optimization problems are not
> standardized. There is no comprehensive guide on coding patterns for
> optimization issues, and my insights are primarily based on personal
> experience. Most online examples tend to focus solely on the model, often
> presented as Jupyter notebooks or sequential scripts. The
> [gurobi-optimods](https://github.com/Gurobi/gurobi-optimods) provide the
> closest examples to production-ready code that I am aware of, yet they offer
> limited guidance on code structuring. I aim to address this gap, which many
> find challenging, though it is important to note that my approach is **highly
> opinionated**.
In this chapter, we will explore various coding patterns that help you structure
your implementations for optimization problems using CP-SAT. These patterns
become especially useful when working on complex problems that need to be solved
repeatedly, potentially under changing requirements.

In many cases, specifying the model and solving it is sufficient without the
need for careful structuring. However, there are situations where your models
are complex and require frequent iteration, either for performance reasons or
due to changing requirements. In such cases, it is crucial to have a good
structure in place to ensure that you can easily modify and extend your code
without breaking it, as well as to facilitate testing and comprehension. Imagine
you have a complex model and need to adapt a constraint due to new requirements.
If your code is not modular and your test suite is only able to test the entire
model, this small change will force you to rewrite all your tests. After a few
iterations, you might end up skipping the tests altogether, which is a dangerous
path to follow.

Another common issue in complex optimization models is the risk of forgetting to
add some trivial constraints to interlink auxiliary variables, which can render
parts of the model dysfunctional. If the dysfunctional part concerns
feasibility, you might still notice it if you have separately checked the
feasibility of the solution. However, if it involves the objective, such as
penalizing certain combinations, you may not easily notice that your solution is
suboptimal, as the penalties are not applied. Furthermore, implementing complex
constraints can be challenging, and a modular structure allows you to test these
constraints separately to ensure they work as intended. Test-driven development
(TDD) is an effective approach for implementing complex constraints quickly and
reliably.
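
One way to catch such silent modeling errors is to re-evaluate every returned solution with a small checker that shares no code with the model. The following is a minimal sketch, using a knapsack problem with optional pairwise penalties as a stand-in. The function `evaluate_selection` and the `pair_penalties` argument are hypothetical names chosen for illustration and are not part of CP-SAT.

```python
from typing import Dict, List, Optional, Tuple


def evaluate_selection(
    weights: List[int],
    values: List[int],
    capacity: int,
    selected: List[int],
    pair_penalties: Optional[Dict[Tuple[int, int], int]] = None,
) -> int:
    """Recompute the objective of a selection without using the model.

    Raises ValueError if the selection is infeasible. `pair_penalties` maps
    a pair of item indices to a penalty that applies if both are selected.
    """
    if len(set(selected)) != len(selected):
        raise ValueError("An item was selected more than once.")
    if any(i < 0 or i >= len(weights) for i in selected):
        raise ValueError("Selection contains an unknown item index.")
    total_weight = sum(weights[i] for i in selected)
    if total_weight > capacity:
        raise ValueError(f"Weight {total_weight} exceeds capacity {capacity}.")
    # Objective: total value minus the penalties of all selected pairs.
    objective = sum(values[i] for i in selected)
    chosen = set(selected)
    for (i, j), penalty in (pair_penalties or {}).items():
        if i in chosen and j in chosen:
            objective -= penalty
    return objective
```

In a test, you would compare this value with the objective value reported by the solver. A mismatch suggests that some part of the model, such as a penalty variable that is not linked to the selection variables, does not behave as intended.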

The field of optimization is highly heterogeneous, and the percentage of
practitioners with a professional software engineering background seems
surprisingly low. Much of the optimization work is done by mathematicians,
physicists, and engineers who have deep expertise in their fields but limited
experience in software engineering. They are usually highly skilled and can
create excellent models, but their code is often hard to maintain and does
not follow software engineering best practices. Many problems are similar enough
that minimal explanation or structure is deemed sufficient, much like creating
plots by copying, pasting, and adjusting a familiar template. While this
approach may not be very readable, it is familiar enough for most people in the
field to understand. Additionally, it is typical for mathematicians to first
document the complete model and only then implement it. From a software
engineering perspective, this workflow resembles the waterfall model, which
lacks agility.

There appears to be a lack of literature on agile software development in
optimization, which this chapter seeks to address by presenting some patterns I
have found useful in my work. I asked a few senior colleagues in the field, and
unfortunately, they either could not point me to any useful resources or did not
even see the need for them. For many use cases, the simple approach is
indeed sufficient. However, I have found that these patterns make my agile,
test-driven workflow much easier, faster, and more enjoyable. Note that this
chapter is largely based on my personal experience due to the limited
availability of references. I would be happy to hear about your experiences and
the patterns you have found useful in your work.

In the following sections, we will start with the basic function-based pattern
and then introduce further concepts and patterns that I have found valuable. We
will work on simple examples where the benefits of these patterns may not be
immediately apparent, but I hope you will see their potential in more complex
problems. The alternative would have been to provide complex examples, which
might have distracted from the patterns themselves.

### Simple Function

@@ -85,6 +123,67 @@ def solve_knapsack(
)
```

You can add further flexibility by forwarding arbitrary solver parameters to the
solver via keyword arguments.

```python
from typing import List

from ortools.sat.python import cp_model


def solve_knapsack(
    weights: List[int],
    values: List[int],
    capacity: int,
    *,
    time_limit: int = 900,
    opt_tol: float = 0.01,
    **kwargs,
) -> List[int]:
    # Initialize the model
    model = cp_model.CpModel()
    # ...
    # Solve the model
    solver = cp_model.CpSolver()
    solver.parameters.max_time_in_seconds = time_limit  # Solver time limit
    solver.parameters.relative_gap_limit = opt_tol  # Solver optimality tolerance
    # Forward any additional solver parameters, e.g., `log_search_progress=True`
    for key, value in kwargs.items():
        setattr(solver.parameters, key, value)
    # ...
```
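
If you forward keyword arguments like this, it may be worth rejecting misspelled parameter names explicitly with a clear error message. The following is a minimal sketch of that idea. The helper `apply_solver_parameters` is a hypothetical name, and a `SimpleNamespace` stands in for `solver.parameters` so that the example runs without OR-Tools installed.

```python
from types import SimpleNamespace
from typing import Any


def apply_solver_parameters(parameters: Any, **overrides: Any) -> None:
    """Set each override on `parameters`, rejecting unknown names."""
    for key, value in overrides.items():
        if not hasattr(parameters, key):
            raise ValueError(f"Unknown solver parameter: {key!r}")
        setattr(parameters, key, value)


# Stand-in for `solver.parameters` (an assumption for this sketch only).
params = SimpleNamespace(max_time_in_seconds=900.0, log_search_progress=False)
apply_solver_parameters(params, log_search_progress=True)
print(params.log_search_progress)  # True
```

On a plain Python object, `setattr` would silently create a new attribute for a misspelled name, so the explicit `hasattr` check is what turns a typo into an immediate error here.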

Add some unit tests in a separate file (e.g., `test_knapsack.py`) to ensure
that the function works as expected.

> [!TIP]
>
> Write the tests before you write the code. This approach is known as
> test-driven development (TDD) and can help you to structure your code better
> and to ensure that your code works as expected. It also helps you to think
> about the API of your function before you start implementing it.

```python
# Make sure you have a proper project structure and can import your function
from myknapsacksolver import solve_knapsack


def test_knapsack_empty():
    # Always good to have a test for the trivial case. The more trivial the
    # case, the more likely it is that you forget it.
    assert solve_knapsack([], [], 0) == []


def test_knapsack_nothing_fits():
    # If nothing fits, we should get an empty solution
    assert solve_knapsack([10, 20, 30], [1, 2, 3], 5) == []


def test_knapsack_one_item():
    # If only one item fits, we should get this item
    assert solve_knapsack([10, 20, 30], [1, 2, 3], 10) == [0]


def test_knapsack_all_items():
    # If all items fit, we should get all items
    assert solve_knapsack([10, 20, 30], [1, 2, 3], 100) == [0, 1, 2]
```

Using pytest, you can run all tests in the project with `pytest .`. Check
[Real Python](https://realpython.com/pytest-python-testing/) for a good tutorial
on pytest.
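
Comparing exact item lists works when the optimum is unique. When several selections are optimal, it can be more robust to compare objective values against a brute-force reference on small random instances. The following is a sketch under that assumption. The helpers `brute_force_knapsack_value` and `check_against_brute_force` are hypothetical, and `solve` stands for any function that takes weights, values, and a capacity and returns the selected item indices.

```python
import itertools
import random
from typing import Callable, List


def brute_force_knapsack_value(
    weights: List[int], values: List[int], capacity: int
) -> int:
    """Return the optimal value by enumerating all subsets (small inputs only)."""
    best = 0
    n = len(weights)
    for mask in itertools.product([False, True], repeat=n):
        weight = sum(w for w, take in zip(weights, mask) if take)
        if weight <= capacity:
            best = max(best, sum(v for v, take in zip(values, mask) if take))
    return best


def check_against_brute_force(
    solve: Callable[[List[int], List[int], int], List[int]],
    trials: int = 20,
    seed: int = 0,
) -> None:
    """Compare `solve` with the brute-force optimum on random small instances."""
    rng = random.Random(seed)  # Fixed seed keeps the test reproducible
    for _ in range(trials):
        n = rng.randint(0, 8)
        weights = [rng.randint(1, 20) for _ in range(n)]
        values = [rng.randint(1, 20) for _ in range(n)]
        capacity = rng.randint(0, 40)
        selected = solve(weights, values, capacity)
        # The selection must be feasible and reach the optimal value.
        assert sum(weights[i] for i in selected) <= capacity
        assert sum(values[i] for i in selected) == brute_force_knapsack_value(
            weights, values, capacity
        )
```

Because `solve_knapsack` uses a nonzero default `opt_tol`, an exact comparison like this would probably need the tolerance disabled, for example via `check_against_brute_force(lambda w, v, c: solve_knapsack(w, v, c, opt_tol=0.0))`.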

### Logging the Model Building

When working with more complex optimization problems, logging the model-building
