wojdyr edited this page Apr 17, 2011 · 1 revision

IT Capacity Planning

Recently (2011) I learned that Fityk is described in the book The Art of Capacity Planning by John Allspaw, published by O'Reilly in 2008. From page 77:

An open source program called fityk does a great job of curve-fitting equations to arbitrary data [...]. For our purposes, the full curve-fitting abilities of fityk are a distinct overkill. It was created for analyzing scientific data that can represent wildly dynamic datasets, not just growing and decaying data. While fityk is primarily a GUI-based application, a command-line version is also available, called cfityk. This version accepts commands that mimic what would have been done with the GUI, so it can be used to automate the curve fitting and forecasting.

The command file used by cfityk is nothing more than a script of actions you can write using the GUI version. Once you have the procedure choreographed in the GUI, you’ll be able to replay the sequence with different data via the command-line tool.

If you have a carriage return–delimited file of x-y data, you can feed it into a command script that can be processed by cfityk. The syntax of the command file is relatively straightforward, particularly for our simple case. Let’s go back to our storage consumption data for an example.

In the code example that follows, we have disk consumption data for a 15-day period, presented in increments of one data point per day. This data is in a file called storage-consumption.xy, and appears as displayed here:

1 14321.83119
2 14452.60193
3 14586.54003
4 14700.89417
5 14845.72223
6 15063.99681
7 15250.21164
8 15403.82607
9 15558.81815
10 15702.35007
11 15835.76298
12 15986.55395
13 16189.27423
14 16367.88211
15 16519.57105

The cfityk command file containing our sequence of actions to run a fit (generated using the GUI) is called fit-storage.fit, and appears as shown below:

@0 < '/home/jallspaw/storage-consumption.xy'
guess Quadratic
fit
info formula # changed, see the notes below
quit

This script imports our x-y data file, sets the equation type to a second-order polynomial (quadratic equation), fits the data, and then returns back information about the fit, such as the formula used [...]


I haven't read this book, but it has very good reviews and if you do capacity planning, go buy it!

I have two notes regarding this description.

In version 0.9.5 (released after the book was published) the syntax changed, so I updated one line in the script above: I removed "in @0" from the info line.

The book does not explicitly say (or I missed it) why the results from fityk and Excel differ. The reason is that fityk uses weighted least squares regression. By default the standard deviation of each point is taken to be sqrt(y), so each point is weighted by 1/y. This has a theoretical justification if y is a count of independent events. If we set all standard deviations, and therefore all weights, to be equal:

S = 1

we get exactly the same results as from Excel.
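The effect of the weighting can be sketched outside fityk in plain Python. This is only an illustration, not fityk's actual code: it assumes the sqrt(y) default means a standard deviation of sqrt(y) per point (a weight of 1/y), and the helper names solve3, fit_quadratic and predict are made up for this example. It fits the quadratic a0 + a1*x + a2*x^2 to the storage data twice, once with equal weights (what a spreadsheet trendline does) and once with the sqrt(y) weighting, and prints both sets of coefficients.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (m[i][n] - s) / m[i][i]
    return c

def fit_quadratic(xs, ys, sigmas):
    """Minimize sum(((y - (a0 + a1*x + a2*x^2)) / sigma)^2); return [a0, a1, a2]."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, y, s in zip(xs, ys, sigmas):
        w = 1.0 / (s * s)            # weight = 1 / sigma^2
        powers = [1.0, x, x * x]
        for i in range(3):
            b[i] += w * powers[i] * y
            for j in range(3):
                a[i][j] += w * powers[i] * powers[j]
    return solve3(a, b)              # weighted normal equations

def predict(coeffs, x):
    """Evaluate the fitted quadratic at x."""
    return coeffs[0] + coeffs[1] * x + coeffs[2] * x * x

# The 15 daily data points quoted above.
days = [float(d) for d in range(1, 16)]
usage = [14321.83119, 14452.60193, 14586.54003, 14700.89417, 14845.72223,
         15063.99681, 15250.21164, 15403.82607, 15558.81815, 15702.35007,
         15835.76298, 15986.55395, 16189.27423, 16367.88211, 16519.57105]

equal = fit_quadratic(days, usage, [1.0] * len(days))                  # like S = 1
weighted = fit_quadratic(days, usage, [math.sqrt(y) for y in usage])   # assumed default

print("equal weights  :", equal)
print("sqrt(y) sigmas :", weighted)
print("day 30 forecast:", predict(equal, 30.0), predict(weighted, 30.0))
```

Because y changes by only about 15% over this range, the weights 1/y are nearly uniform, so the two fits should be close but not identical, which is the kind of small discrepancy one sees when comparing against Excel.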
