<!DOCTYPE html>
<head>
<!--<script type="text/javascript" src="https://trinket.io/js/trinket.js"></script>-->
<link rel="stylesheet" href="trinket/base.css" type="text/css" />
<link rel="stylesheet" href="trinket/trinket.css" type="text/css" />
<link rel="stylesheet" href="trinket/font-awesome.min.css" type="text/css" />
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { color: #008000; } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { color: #008000; font-weight: bold; } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style>
/* Temporary fix for Font Awesome ApplicationManifest offline cache */
@font-face {
font-family: 'FontAwesome';
src: url('../fonts/fontawesome-webfont.eot');
src: url('../fonts/fontawesome-webfont.eot') format('embedded-opentype'), url('../fonts/fontawesome-webfont.woff2') format('woff2'), url('../fonts/fontawesome-webfont.woff') format('woff'), url('../fonts/fontawesome-webfont.ttf') format('truetype'), url('../fonts/fontawesome-webfont.svg') format('svg');
font-weight: normal;
font-style: normal;
}
</style>
</head>
<body class="loggedout">
<div class="main-content">
<div class="nav-wrapper">
<nav class="top-bar" data-topbar data-options="is_hover: false">
<ul class="title-area">
<li class="name">
<a href="/">
<img src="https://trinket.io/img/trinket-logo.png" alt="Hosted by Trinket" />
</a>
</li>
<li class="toggle-topbar menu-icon"><a href="#"><span></span></a></li>
</ul>
<section class="top-bar-section">
<ul class="right">
<li><a href="http://pythonlearn.com"><i class="fa fa-star"></i> PythonLearn</a></li>
<li><a href="https://trinket.io"><i class="fa fa-star"></i> Trinket</a></li>
<li><a href="https://hourofpython.com"><i class="fa fa-graduation-cap"></i> Hour of Python</a></li>
</ul>
</section>
</nav>
</div>
<div class="booktoc sticky">
<nav class="top-bar" data-topbar="" role="navigation">
<ul class="title-area">
<li class="name">
<h1 class="no-anchor"><a href="#">Python for Everyone</a></h1>
</li>
<li class="toggle-topbar"><a href="#"><span>Menu</span></a></li>
</ul>
<section class="top-bar-section">
<ul class="left">
<li class="has-dropdown not-click">
<a href="#">Chapters</a>
<ul class="dropdown">
<li class="title back js-generated">
<h5><a href="javascript:void(0)">Back</a></h5></li>
<li class="parent-link hide-for-medium-up"><a class="parent-link js-generated" href="#">Chapters</a></li>
<li><a href="index.html">See All Chapters</a></li>
<li><a href="01-intro.html">Chapter 1: Introduction</a></li>
<li><a href="02-variables.html">Chapter 2: Variables</a></li>
<li><a href="03-conditional.html">Chapter 3: Conditionals</a></li>
<li><a href="04-functions.html">Chapter 4: Functions</a></li>
<li><a href="05-iterations.html">Chapter 5: Iterations</a></li>
<li><a href="06-strings.html">Chapter 6: Strings</a></li>
<li><a href="07-files.html">Chapter 7: Files</a></li>
<li><a href="08-lists.html">Chapter 8: Lists</a></li>
<li><a href="09-dictionaries.html">Chapter 9: Dictionaries</a></li>
<li><a href="10-tuples.html">Chapter 10: Tuples</a></li>
<li><a href="11-regex.html">Chapter 11: Regex</a></li>
<li><a href="12-network.html">Chapter 12: Networked Programs</a></li>
<li><a href="13-web.html">Chapter 13: Python and Web Services</a></li>
<li><a href="14-objects.html">Chapter 14: Object Orientation</a></li>
<li><a href="15-database.html">Chapter 15: Python and Databases</a></li>
<li><a href="16-viz.html">Chapter 16: Data Visualization</a></li>
</ul>
</li>
</ul>
</section>
</nav>
</div>
<div class="bookchapter">
<div class="row">
<div class="columns small-12">
<h1 id="strings">Strings</h1>
<h2 id="a-string-is-a-sequence">A string is a sequence</h2>
<p> </p>
<p>A string is a <em>sequence</em> of characters. You can access the
characters one at a time with the bracket operator:</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> letter <span class="op">=</span> fruit[<span class="dv">1</span>]</span></code></pre></div>
<p> </p>
<p>The second statement extracts the character at index position 1 from
the <code>fruit</code> variable and assigns it to the
<code>letter</code> variable.</p>
<p>The expression in brackets is called an <em>index</em>. The index
indicates which character in the sequence you want (hence the name).</p>
<p>But you might not get what you expect:</p>
<div class="sourceCode" id="cb2"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(letter)</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>a</span></code></pre></div>
<p>For most people, the first letter of “banana” is “b”, not “a”. But in
Python, the index is an offset from the beginning of the string, and the
offset of the first letter is zero.</p>
<div class="sourceCode" id="cb3"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> letter <span class="op">=</span> fruit[<span class="dv">0</span>]</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(letter)</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>b</span></code></pre></div>
<p>So “b” is the 0th letter (“zero-th”) of “banana”, “a” is the 1th
letter (“one-th”), and “n” is the 2th (“two-th”) letter.</p>
<figure>
<img src="../images/string.svg" alt="String Indexes" style="height: 0.75in;"/>
<figcaption>
String Indexes
</figcaption>
</figure>
<p> </p>
<p>You can use any expression, including variables and operators, as an
index, but the value of the index has to be an integer. Otherwise you
get:</p>
<p> </p>
<div class="sourceCode" id="cb4"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> letter <span class="op">=</span> fruit[<span class="fl">1.5</span>]</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="pp">TypeError</span>: string indices must be integers</span></code></pre></div>
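<p>As a small illustration of using expressions as indices, the sketch
below computes two indices from other values and then shows the failure
when the expression produces a float. The variable names
<code>i</code>, <code>middle</code>, and <code>error_text</code> are
only illustrative.</p>

```python
fruit = 'banana'
i = 1

letter = fruit[i + 1]              # the expression evaluates to the integer 2
middle = fruit[len(fruit) // 2]    # // is integer division, so the index is 3

print(letter, middle)

try:
    fruit[len(fruit) / 2]          # / produces the float 3.0, which is rejected
except TypeError as err:
    error_text = str(err)
    print('TypeError:', error_text)
```

<p>The exact wording of the error message can differ slightly between
Python versions, so the sketch prints whatever message it receives.</p>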
<h2 id="getting-the-length-of-a-string-using-len">Getting the length of
a string using <code>len</code></h2>
<p> </p>
<p><code>len</code> is a built-in function that returns the number of
characters in a string:</p>
<div class="sourceCode" id="cb5"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">len</span>(fruit)</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="dv">6</span></span></code></pre></div>
<p>To get the last letter of a string, you might be tempted to try
something like this:</p>
<p> </p>
<div class="sourceCode" id="cb6"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> length <span class="op">=</span> <span class="bu">len</span>(fruit)</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> last <span class="op">=</span> fruit[length]</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="pp">IndexError</span>: string index out of <span class="bu">range</span></span></code></pre></div>
<p>The reason for the <code>IndexError</code> is that there is no letter
in “banana” with the index 6. Since we started counting at zero, the six
letters are numbered 0 to 5. To get the last character, you have to
subtract 1 from <code>length</code>:</p>
<div class="sourceCode" id="cb7"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> last <span class="op">=</span> fruit[length<span class="op">-</span><span class="dv">1</span>]</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(last)</span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a>a</span></code></pre></div>
<p>Alternatively, you can use negative indices, which count backward
from the end of the string. The expression <code>fruit[-1]</code> yields
the last letter, <code>fruit[-2]</code> yields the second to last, and
so on.</p>
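<p>To see how negative indices line up with the <code>len</code>-based
approach, the following sketch pairs each negative index with its
equivalent positive one. The variable names are chosen only for this
illustration.</p>

```python
fruit = 'banana'
length = len(fruit)

# A negative index -k refers to the same character as index length - k.
last = fruit[-1]               # same character as fruit[length - 1]
second_to_last = fruit[-2]     # same character as fruit[length - 2]
first = fruit[-length]         # same character as fruit[0]

print(last, second_to_last, first)
```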
<p> </p>
<h2 id="traversal-through-a-string-with-a-loop">Traversal through a
string with a loop</h2>
<p> </p>
<p>A lot of computations involve processing a string one character at a
time. Often they start at the beginning, select each character in turn,
do something to it, and continue until the end. This pattern of
processing is called a <em>traversal</em>. One way to write a traversal
is with a <code>while</code> loop:</p>
<div class="sourceCode" id="cb8"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>index <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="cf">while</span> index <span class="op"><</span> <span class="bu">len</span>(fruit):</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> letter <span class="op">=</span> fruit[index]</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(letter)</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> index <span class="op">=</span> index <span class="op">+</span> <span class="dv">1</span></span></code></pre></div>
<p>This loop traverses the string and displays each letter on a line by
itself. The loop condition is <code>index < len(fruit)</code>, so
when <code>index</code> is equal to the length of the string, the
condition is false, and the body of the loop is not executed. The last
character accessed is the one with the index <code>len(fruit)-1</code>,
which is the last character in the string.</p>
<p><strong>Exercise 1: Write a <code>while</code> loop that starts at
the last character in the string and works its way backwards to the
first character in the string, printing each letter on a separate
line.</strong></p>
<p>Another way to write a traversal is with a <code>for</code> loop:</p>
<div class="sourceCode" id="cb9"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> char <span class="kw">in</span> fruit:</span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(char)</span></code></pre></div>
<p>Each time through the loop, the next character in the string is
assigned to the variable <code>char</code>. The loop continues until no
characters are left.</p>
<h2 id="string-slices">String slices</h2>
<p> </p>
<p>A segment of a string is called a <em>slice</em>. Selecting a slice
is similar to selecting a character:</p>
<div class="sourceCode" id="cb10"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> s <span class="op">=</span> <span class="st">'Monty Python'</span></span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(s[<span class="dv">0</span>:<span class="dv">5</span>])</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>Monty</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(s[<span class="dv">6</span>:<span class="dv">12</span>])</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a>Python</span></code></pre></div>
<p>The operator <code>[n:m]</code> returns the part of the string from
the “n-th” character to the “m-th” character, including the first but
excluding the last.</p>
<p>If you omit the first index (before the colon), the slice starts at
the beginning of the string. If you omit the second index, the slice
goes to the end of the string:</p>
<div class="sourceCode" id="cb11"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit[:<span class="dv">3</span>]</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a><span class="co">'ban'</span></span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit[<span class="dv">3</span>:]</span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a><span class="co">'ana'</span></span></code></pre></div>
<p>If the first index is greater than or equal to the second, the result
is an <em>empty string</em>, represented by two quotation marks:</p>
<p></p>
<div class="sourceCode" id="cb12"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> fruit[<span class="dv">3</span>:<span class="dv">3</span>]</span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a><span class="co">''</span></span></code></pre></div>
<p>An empty string contains no characters and has length 0, but other
than that, it is the same as any other string.</p>
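<p>The slicing rules above can be checked with a short sketch. It
assumes both indices fall within the string, in which case the slice
<code>s[n:m]</code> contains <code>m - n</code> characters. The names
<code>piece</code>, <code>rebuilt</code>, and <code>empty</code> are
illustrative.</p>

```python
s = 'Monty Python'

n, m = 6, 12
piece = s[n:m]                 # characters at index 6 up to, not including, 12
print(piece, len(piece))       # the length equals m - n here

# Splitting at one index and joining the two halves rebuilds the string.
k = 5
rebuilt = s[:k] + s[k:]
print(rebuilt == s)

# A start index larger than the end index gives an empty string.
empty = s[4:2]
print(len(empty))
```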
<p><strong>Exercise 2: Given that <code>fruit</code> is a string, what
does <code>fruit[:]</code> mean?</strong></p>
<p> </p>
<h2 id="strings-are-immutable">Strings are immutable</h2>
<p> </p>
<p>It is tempting to use the <code>[]</code> operator on the left side
of an assignment, with the intention of changing a character in a
string. For example:</p>
<p> </p>
<div class="sourceCode" id="cb13"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> greeting <span class="op">=</span> <span class="st">'Hello, world!'</span></span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> greeting[<span class="dv">0</span>] <span class="op">=</span> <span class="st">'J'</span></span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a><span class="pp">TypeError</span>: <span class="st">'str'</span> <span class="bu">object</span> does <span class="kw">not</span> support item assignment</span></code></pre></div>
<p>The “object” in this case is the string and the “item” is the
character you tried to assign. For now, an <em>object</em> is the same
thing as a value, but we will refine that definition later. An
<em>item</em> is one of the values in a sequence.</p>
<p> </p>
<p>The reason for the error is that strings are <em>immutable</em>,
which means you can’t change an existing string. The best you can do is
create a new string that is a variation on the original:</p>
<div class="sourceCode" id="cb14"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> greeting <span class="op">=</span> <span class="st">'Hello, world!'</span></span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> new_greeting <span class="op">=</span> <span class="st">'J'</span> <span class="op">+</span> greeting[<span class="dv">1</span>:]</span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(new_greeting)</span>
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a>Jello, world<span class="op">!</span></span></code></pre></div>
<p>This example concatenates a new first letter onto a slice of
<code>greeting</code>. It has no effect on the original string.</p>
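<p>The same idea extends to any position in the string. The sketch
below wraps it in a hypothetical helper, <code>replace_at</code>, which
is not part of Python and assumes a non-negative index that is within
the string.</p>

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced.

    Hypothetical helper; assumes 0 <= index < len(text).
    """
    # Join the slice before the index, the new character,
    # and the slice after the index into a new string.
    return text[:index] + new_char + text[index + 1:]

greeting = 'Hello, world!'
new_greeting = replace_at(greeting, 0, 'J')

print(new_greeting)    # the modified copy
print(greeting)        # the original string is unchanged
```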
<p></p>
<h2 id="looping-and-counting">Looping and counting</h2>
<p> </p>
<p>The following program counts the number of times the letter “a”
appears in a string:</p>
<div class="sourceCode" id="cb15"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a>word <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a>count <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> letter <span class="kw">in</span> word:</span>
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> letter <span class="op">==</span> <span class="st">'a'</span>:</span>
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a> count <span class="op">=</span> count <span class="op">+</span> <span class="dv">1</span></span>
<span id="cb15-6"><a href="#cb15-6" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(count)</span></code></pre></div>
<p>This program demonstrates another pattern of computation called a
<em>counter</em>. The variable <code>count</code> is initialized to 0
and then incremented each time an “a” is found. When the loop exits,
<code>count</code> contains the result: the total number of a’s.</p>
<p></p>
<p><strong>Exercise 3: Encapsulate this code in a function named
<code>count</code>, and generalize it so that it accepts the string and
the letter as arguments.</strong></p>
<h2 id="the-in-operator">The <code>in</code> operator</h2>
<p> </p>
<p>The word <code>in</code> is a boolean operator that takes two strings
and returns <code>True</code> if the first appears as a substring in the
second:</p>
<div class="sourceCode" id="cb16"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'a'</span> <span class="kw">in</span> <span class="st">'banana'</span></span>
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="va">True</span></span>
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'seed'</span> <span class="kw">in</span> <span class="st">'banana'</span></span>
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a><span class="va">False</span></span></code></pre></div>
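<p>Because <code>in</code> produces a boolean, it can serve directly as
the condition of an <code>if</code> statement. The following sketch,
with illustrative variable names, also shows that the match is
case-sensitive and that the characters must appear next to each
other.</p>

```python
word = 'banana'

# The in operator yields True or False, so it works as a condition.
if 'nan' in word:
    message = 'found it'
else:
    message = 'not found'
print(message)

# Matching is case-sensitive, and the substring must be contiguous.
print('B' in word)
print('bnn' in word)
```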
<h2 id="string-comparison">String comparison</h2>
<p> </p>
<p>The comparison operators work on strings. To see if two strings are
equal:</p>
<div class="sourceCode" id="cb17"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> word <span class="op">==</span> <span class="st">'banana'</span>:</span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'All right, bananas.'</span>)</span></code></pre></div>
<p>Other comparison operations are useful for putting words in
alphabetical order:</p>
<div class="sourceCode" id="cb18"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> word <span class="op"><</span> <span class="st">'banana'</span>:</span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'Your word, '</span> <span class="op">+</span> word <span class="op">+</span> <span class="st">', comes before banana.'</span>)</span>
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a><span class="cf">elif</span> word <span class="op">></span> <span class="st">'banana'</span>:</span>
<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'Your word, '</span> <span class="op">+</span> word <span class="op">+</span> <span class="st">', comes after banana.'</span>)</span>
<span id="cb18-5"><a href="#cb18-5" aria-hidden="true" tabindex="-1"></a><span class="cf">else</span>:</span>
<span id="cb18-6"><a href="#cb18-6" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'All right, bananas.'</span>)</span></code></pre></div>
<p>Python does not handle uppercase and lowercase letters the same way
that people do. All the uppercase letters come before all the lowercase
letters, so:</p>
<pre><code>Your word, Pineapple, comes before banana.</code></pre>
<p>A common way to address this problem is to convert strings to a
standard format, such as all lowercase, before performing the
comparison. Keep that in mind in case you have to defend yourself
against a man armed with a Pineapple.</p>
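<p>One possible way to apply that conversion is sketched below. It uses
the string method <code>lower</code>, which returns a lowercase copy of
a string. The function name <code>comes_before</code> is made up for
this example.</p>

```python
def comes_before(first, second):
    """Compare two words alphabetically, ignoring case (illustrative helper)."""
    return first.lower() < second.lower()

# Raw comparison: uppercase letters sort before lowercase ones.
print('Pineapple' < 'banana')

# Comparison after converting both words to lowercase.
print(comes_before('Pineapple', 'banana'))
```

<p>The first comparison reports that “Pineapple” comes before “banana”,
while the second, which compares lowercase copies, reports that it does
not.</p>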
<h2 id="string-methods">String methods</h2>
<p>Strings are an example of Python <em>objects</em>. An object contains
both data (the actual string itself) and <em>methods</em>, which are
effectively functions that are built into the object and are available
to any <em>instance</em> of the object.</p>
<p>Python has a function called <code>type</code> that shows the type of
an object, and a function called <code>dir</code> that lists the methods
available for an object.</p>
<div class="sourceCode" id="cb20"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> stuff <span class="op">=</span> <span class="st">'Hello world'</span></span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">type</span>(stuff)</span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a><span class="op"><</span><span class="kw">class</span> <span class="st">'str'</span><span class="op">></span></span>
<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">dir</span>(stuff)</span>
<span id="cb20-5"><a href="#cb20-5" aria-hidden="true" tabindex="-1"></a>[<span class="st">'capitalize'</span>, <span class="st">'casefold'</span>, <span class="st">'center'</span>, <span class="st">'count'</span>, <span class="st">'encode'</span>,</span>
<span id="cb20-6"><a href="#cb20-6" aria-hidden="true" tabindex="-1"></a><span class="st">'endswith'</span>, <span class="st">'expandtabs'</span>, <span class="st">'find'</span>, <span class="st">'format'</span>, <span class="st">'format_map'</span>,</span>
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a><span class="st">'index'</span>, <span class="st">'isalnum'</span>, <span class="st">'isalpha'</span>, <span class="st">'isdecimal'</span>, <span class="st">'isdigit'</span>,</span>
<span id="cb20-8"><a href="#cb20-8" aria-hidden="true" tabindex="-1"></a><span class="st">'isidentifier'</span>, <span class="st">'islower'</span>, <span class="st">'isnumeric'</span>, <span class="st">'isprintable'</span>,</span>
<span id="cb20-9"><a href="#cb20-9" aria-hidden="true" tabindex="-1"></a><span class="st">'isspace'</span>, <span class="st">'istitle'</span>, <span class="st">'isupper'</span>, <span class="st">'join'</span>, <span class="st">'ljust'</span>, <span class="st">'lower'</span>,</span>
<span id="cb20-10"><a href="#cb20-10" aria-hidden="true" tabindex="-1"></a><span class="st">'lstrip'</span>, <span class="st">'maketrans'</span>, <span class="st">'partition'</span>, <span class="st">'replace'</span>, <span class="st">'rfind'</span>,</span>
<span id="cb20-11"><a href="#cb20-11" aria-hidden="true" tabindex="-1"></a><span class="st">'rindex'</span>, <span class="st">'rjust'</span>, <span class="st">'rpartition'</span>, <span class="st">'rsplit'</span>, <span class="st">'rstrip'</span>,</span>
<span id="cb20-12"><a href="#cb20-12" aria-hidden="true" tabindex="-1"></a><span class="st">'split'</span>, <span class="st">'splitlines'</span>, <span class="st">'startswith'</span>, <span class="st">'strip'</span>, <span class="st">'swapcase'</span>,</span>
<span id="cb20-13"><a href="#cb20-13" aria-hidden="true" tabindex="-1"></a><span class="st">'title'</span>, <span class="st">'translate'</span>, <span class="st">'upper'</span>, <span class="st">'zfill'</span>]</span>
<span id="cb20-14"><a href="#cb20-14" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">help</span>(<span class="bu">str</span>.capitalize)</span>
<span id="cb20-15"><a href="#cb20-15" aria-hidden="true" tabindex="-1"></a>Help on method_descriptor:</span>
<span id="cb20-16"><a href="#cb20-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-17"><a href="#cb20-17" aria-hidden="true" tabindex="-1"></a>capitalize(...)</span>
<span id="cb20-18"><a href="#cb20-18" aria-hidden="true" tabindex="-1"></a> S.capitalize() <span class="op">-></span> <span class="bu">str</span></span>
<span id="cb20-19"><a href="#cb20-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-20"><a href="#cb20-20" aria-hidden="true" tabindex="-1"></a> Return a capitalized version of S, i.e. make the first character</span>
<span id="cb20-21"><a href="#cb20-21" aria-hidden="true" tabindex="-1"></a> have upper case <span class="kw">and</span> the rest lower case.</span>
<span id="cb20-22"><a href="#cb20-22" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span></span></code></pre></div>
<p>While the <code>dir</code> function lists the methods, and you can
use <code>help</code> to get some simple documentation on a method, a
better source of documentation for string methods would be <a
href="https://docs.python.org/library/stdtypes.html#string-methods"
class="uri">https://docs.python.org/library/stdtypes.html#string-methods</a>.</p>
<p>Calling a <em>method</em> is similar to calling a function (it takes
arguments and returns a value) but the syntax is different. We call a
method by appending the method name to the variable name using the
period as a delimiter.</p>
<p>For example, the method <code>upper</code> takes a string and returns
a new string with all uppercase letters. Instead of the function syntax
<code>upper(word)</code>, it uses the method syntax
<code>word.upper()</code>:</p>
<div class="sourceCode" id="cb21"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> word <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> new_word <span class="op">=</span> word.upper()</span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(new_word)</span>
<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a>BANANA</span></code></pre></div>
<p>This form of dot notation specifies the name of the method,
<code>upper</code>, and the name of the string to apply the method to,
<code>word</code>. The empty parentheses indicate that this method takes
no arguments.</p>
<p>A method call is called an <em>invocation</em>; in this case, we
would say that we are invoking <code>upper</code> on
<code>word</code>.</p>
<p>For example, there is a string method named <code>find</code> that
searches for the position of one string within another:</p>
<div class="sourceCode" id="cb22"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> word <span class="op">=</span> <span class="st">'banana'</span></span>
<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> index <span class="op">=</span> word.find(<span class="st">'a'</span>)</span>
<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(index)</span>
<span id="cb22-4"><a href="#cb22-4" aria-hidden="true" tabindex="-1"></a><span class="dv">1</span></span></code></pre></div>
<p>In this example, we invoke <code>find</code> on <code>word</code> and
pass the letter we are looking for as an argument.</p>
<p>The <code>find</code> method can find substrings as well as
characters:</p>
<div class="sourceCode" id="cb23"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> word.find(<span class="st">'na'</span>)</span>
<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a><span class="dv">2</span></span></code></pre></div>
<p>It can take as a second argument the index where the search should
start:</p>
<div class="sourceCode" id="cb24"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb24-1"><a href="#cb24-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> word.find(<span class="st">'na'</span>, <span class="dv">3</span>)</span>
<span id="cb24-2"><a href="#cb24-2" aria-hidden="true" tabindex="-1"></a><span class="dv">4</span></span></code></pre></div>
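<p>A minimal sketch of two more behaviors of <code>find</code>, based on
the documented signature <code>find(sub[, start[, end]])</code>: it
returns -1 when the substring is not present, and the optional third
argument limits where the search stops. The variable names
<code>missing</code>, <code>limited</code>, and <code>second</code> are
only illustrative.</p>

```python
word = 'banana'

# find returns -1 (it does not raise an error) when the substring is absent
missing = word.find('z')
print(missing)     # -1

# The third argument is the index where the search stops (not included)
limited = word.find('na', 0, 3)
print(limited)     # -1, because 'na' does not fit inside word[0:3], which is 'ban'

# Starting at index 3 skips the first 'na' and finds the second one
second = word.find('na', 3)
print(second)      # 4
```

<p>Because -1 is also a valid index into a string, it is worth checking
the result of <code>find</code> before using it in a slice.</p>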
<p>One common task is to remove white space (spaces, tabs, or newlines)
from the beginning and end of a string using the <code>strip</code>
method:</p>
<div class="sourceCode" id="cb25"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line <span class="op">=</span> <span class="st">' Here we go '</span></span>
<span id="cb25-2"><a href="#cb25-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.strip()</span>
<span id="cb25-3"><a href="#cb25-3" aria-hidden="true" tabindex="-1"></a><span class="co">'Here we go'</span></span></code></pre></div>
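<p>The method list shown earlier also includes <code>lstrip</code> and
<code>rstrip</code>. As a small sketch (the sample string is made up for
illustration), they remove white space from only one end, and none of
the three methods changes the original string:</p>

```python
line = '  Here we go  \n'

both = line.strip()     # remove white space from both ends
left = line.lstrip()    # remove white space from the beginning only
right = line.rstrip()   # remove white space from the end only

print(repr(both))       # 'Here we go'
print(repr(left))       # 'Here we go  \n'
print(repr(right))      # '  Here we go'
print(repr(line))       # the original string is unchanged
```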
<p>Some methods, such as <code>startswith</code>, return boolean values.</p>
<div class="sourceCode" id="cb26"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb26-1"><a href="#cb26-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line <span class="op">=</span> <span class="st">'Have a nice day'</span></span>
<span id="cb26-2"><a href="#cb26-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.startswith(<span class="st">'Have'</span>)</span>
<span id="cb26-3"><a href="#cb26-3" aria-hidden="true" tabindex="-1"></a><span class="va">True</span></span>
<span id="cb26-4"><a href="#cb26-4" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.startswith(<span class="st">'h'</span>)</span>
<span id="cb26-5"><a href="#cb26-5" aria-hidden="true" tabindex="-1"></a><span class="va">False</span></span></code></pre></div>
<p>Note that <code>startswith</code> is case-sensitive, so sometimes we
use the <code>lower</code> method to convert a line to lowercase before
we do any checking.</p>
<div class="sourceCode" id="cb27"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb27-1"><a href="#cb27-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line <span class="op">=</span> <span class="st">'Have a nice day'</span></span>
<span id="cb27-2"><a href="#cb27-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.startswith(<span class="st">'h'</span>)</span>
<span id="cb27-3"><a href="#cb27-3" aria-hidden="true" tabindex="-1"></a><span class="va">False</span></span>
<span id="cb27-4"><a href="#cb27-4" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.lower()</span>
<span id="cb27-5"><a href="#cb27-5" aria-hidden="true" tabindex="-1"></a><span class="co">'have a nice day'</span></span>
<span id="cb27-6"><a href="#cb27-6" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> line.lower().startswith(<span class="st">'h'</span>)</span>
<span id="cb27-7"><a href="#cb27-7" aria-hidden="true" tabindex="-1"></a><span class="va">True</span></span></code></pre></div>
<p>In the last example, the method <code>lower</code> is called and then
we use <code>startswith</code> to see if the resulting lowercase string
starts with the letter “h”. As long as we are careful with the order, we
can make multiple method calls in a single expression.</p>
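<p>To illustrate why the order matters, here is a small sketch with a
made-up input line. Each method is applied to the value returned by the
method before it, so a chain only works if every intermediate value has
the method that comes next:</p>

```python
line = '   Have a nice day\n'

# Each method is applied to the result of the one before it:
# strip first, then lower, then startswith
starts_with_h = line.strip().lower().startswith('h')
print(starts_with_h)        # True

# Wrong order: startswith returns a boolean, and booleans have no lower method
try:
    line.strip().startswith('h').lower()
except AttributeError as err:
    print('Wrong order:', err)
```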
<p><strong>Exercise 4: There is a string method called
<code>count</code> that is similar to the function in the previous
exercise. Read the documentation of this method at:</strong></p>
<p><a
href="https://docs.python.org/library/stdtypes.html#string-methods"
class="uri">https://docs.python.org/library/stdtypes.html#string-methods</a></p>
<p><strong>Write an invocation that counts the number of times the
letter a occurs in “banana”.</strong></p>
<h2 id="parsing-strings">Parsing strings</h2>
<p>Often, we want to look into a string and find a substring. For
example, if we were presented with a series of lines formatted as follows:</p>
<p><code>From stephen.marquard@</code><em><code>uct.ac.za</code></em><code> Sat Jan 5 09:14:16 2008</code></p>
<p>and we wanted to pull out only the second half of the address (i.e.,
<code>uct.ac.za</code>) from each line, we could do this by using the
<code>find</code> method and string slicing.</p>
<p>First, we will find the position of the at-sign in the string. Then
we will find the position of the first space <em>after</em> the at-sign.
Finally, we will use string slicing to extract the portion of the string
that we are looking for.</p>
<div class="sourceCode" id="cb28"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> data <span class="op">=</span> <span class="st">'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'</span></span>
<span id="cb28-2"><a href="#cb28-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> atpos <span class="op">=</span> data.find(<span class="st">'@'</span>)</span>
<span id="cb28-3"><a href="#cb28-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(atpos)</span>
<span id="cb28-4"><a href="#cb28-4" aria-hidden="true" tabindex="-1"></a><span class="dv">21</span></span>
<span id="cb28-5"><a href="#cb28-5" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> sppos <span class="op">=</span> data.find(<span class="st">' '</span>,atpos)</span>
<span id="cb28-6"><a href="#cb28-6" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(sppos)</span>
<span id="cb28-7"><a href="#cb28-7" aria-hidden="true" tabindex="-1"></a><span class="dv">31</span></span>
<span id="cb28-8"><a href="#cb28-8" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> host <span class="op">=</span> data[atpos<span class="op">+</span><span class="dv">1</span>:sppos]</span>
<span id="cb28-9"><a href="#cb28-9" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="bu">print</span>(host)</span>
<span id="cb28-10"><a href="#cb28-10" aria-hidden="true" tabindex="-1"></a>uct.ac.za</span>
<span id="cb28-11"><a href="#cb28-11" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span></span></code></pre></div>
<p>We use a version of the <code>find</code> method that allows us to
specify the position in the string where we want <code>find</code> to
start looking. When we slice, we extract the characters from “one beyond
the at-sign up to <em>but not including</em> the space
character”.</p>
<p>The documentation for the <code>find</code> method is available
at</p>
<p><a
href="https://docs.python.org/library/stdtypes.html#string-methods"
class="uri">https://docs.python.org/library/stdtypes.html#string-methods</a>.</p>
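<p>The steps above can be wrapped in a function. This is only a sketch:
the name <code>extract_host</code> and the choice to return
<code>None</code> when there is no at-sign are assumptions made for
illustration, not part of the original example. It also handles the case
where <code>find</code> returns -1 because no space follows the
address.</p>

```python
def extract_host(line):
    """Return the text between '@' and the next space, or None if there is no '@'."""
    atpos = line.find('@')
    if atpos == -1:
        return None              # find returns -1 when '@' is missing
    sppos = line.find(' ', atpos)
    if sppos == -1:
        sppos = len(line)        # no space after the address: slice to the end
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
print(extract_host(data))                 # uct.ac.za
print(extract_host('no address here'))    # None
```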
<h2 id="format-operator">Format operator</h2>
<p>The <em>format operator</em>, <code>%</code>, allows us to construct
strings, replacing parts of the strings with the data stored in
variables. When applied to integers, <code>%</code> is the modulus
operator. But when the first operand is a string, <code>%</code> is the
format operator.</p>
<p>The first operand is the <em>format string</em>, which contains one
or more <em>format sequences</em> that specify how the second operand is
formatted. The result is a string.</p>
<p>For example, the format sequence <code>%d</code> means that the
second operand should be formatted as an integer (“d” stands for
“decimal”):</p>
<div class="sourceCode" id="cb29"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb29-1"><a href="#cb29-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> camels <span class="op">=</span> <span class="dv">42</span></span>
<span id="cb29-2"><a href="#cb29-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'</span><span class="sc">%d</span><span class="st">'</span> <span class="op">%</span> camels</span>
<span id="cb29-3"><a href="#cb29-3" aria-hidden="true" tabindex="-1"></a><span class="co">'42'</span></span></code></pre></div>
<p>The result is the string ‘42’, which is not to be confused with the
integer value 42.</p>
<p>A format sequence can appear anywhere in the string, so you can embed
a value in a sentence:</p>
<div class="sourceCode" id="cb30"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb30-1"><a href="#cb30-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> camels <span class="op">=</span> <span class="dv">42</span></span>
<span id="cb30-2"><a href="#cb30-2" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'I have spotted </span><span class="sc">%d</span><span class="st"> camels.'</span> <span class="op">%</span> camels</span>
<span id="cb30-3"><a href="#cb30-3" aria-hidden="true" tabindex="-1"></a><span class="co">'I have spotted 42 camels.'</span></span></code></pre></div>
<p>If there is more than one format sequence in the string, the second
operand has to be a tuple<a href="#fn1" class="footnote-ref"
id="fnref1" role="doc-noteref"><sup>1</sup></a>. Each format sequence is
matched with an element of the tuple, in order.</p>
<p>The following example uses <code>%d</code> to format an integer,
<code>%g</code> to format a floating-point number (don’t ask why), and
<code>%s</code> to format a string:</p>
<div class="sourceCode" id="cb31"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb31-1"><a href="#cb31-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'In </span><span class="sc">%d</span><span class="st"> years I have spotted </span><span class="sc">%g</span><span class="st"> </span><span class="sc">%s</span><span class="st">.'</span> <span class="op">%</span> (<span class="dv">3</span>, <span class="fl">0.1</span>, <span class="st">'camels'</span>)</span>
<span id="cb31-2"><a href="#cb31-2" aria-hidden="true" tabindex="-1"></a><span class="co">'In 3 years I have spotted 0.1 camels.'</span></span></code></pre></div>
<p>The number of elements in the tuple must match the number of format
sequences in the string. The types of the elements also must match the
format sequences:</p>
<div class="sourceCode" id="cb32"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb32-1"><a href="#cb32-1" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'</span><span class="sc">%d</span><span class="st"> </span><span class="sc">%d</span><span class="st"> </span><span class="sc">%d</span><span class="st">'</span> <span class="op">%</span> (<span class="dv">1</span>, <span class="dv">2</span>)</span>
<span id="cb32-2"><a href="#cb32-2" aria-hidden="true" tabindex="-1"></a><span class="pp">TypeError</span>: <span class="kw">not</span> enough arguments <span class="cf">for</span> <span class="bu">format</span> string</span>
<span id="cb32-3"><a href="#cb32-3" aria-hidden="true" tabindex="-1"></a><span class="op">>>></span> <span class="st">'</span><span class="sc">%d</span><span class="st">'</span> <span class="op">%</span> <span class="st">'dollars'</span></span>
<span id="cb32-4"><a href="#cb32-4" aria-hidden="true" tabindex="-1"></a><span class="pp">TypeError</span>: <span class="op">%</span>d <span class="bu">format</span>: a number <span class="kw">is</span> required, <span class="kw">not</span> <span class="bu">str</span></span></code></pre></div>
<p>In the first example, there aren’t enough elements; in the second,
the element is the wrong type.</p>
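<p>In a program, rather than at the interactive prompt, these errors can
be caught with <code>try</code> and <code>except</code>. The following
is a minimal sketch; the helper name <code>safe_format</code> and the
text of its error message are hypothetical and chosen only for this
illustration.</p>

```python
# One format sequence per tuple element, matched in order
years, rate, animal = 3, 0.1, 'camels'
message = 'In %d years I have spotted %g %s.' % (years, rate, animal)
print(message)    # In 3 years I have spotted 0.1 camels.

def safe_format(template, values):
    """Apply the format operator, returning an error description on a mismatch."""
    try:
        return template % values
    except TypeError as err:
        # Raised when the count or the type of the values does not match
        return 'format error: ' + str(err)

print(safe_format('%d %d %d', (1, 2)))   # not enough elements in the tuple
print(safe_format('%d', 'dollars'))      # the element is the wrong type
print(safe_format('%d', 42))             # 42
```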
<p>The format operator is powerful, but it can be difficult to use. You
can read more about it at</p>
<p><a
href="https://docs.python.org/library/stdtypes.html#printf-style-string-formatting"
class="uri">https://docs.python.org/library/stdtypes.html#printf-style-string-formatting</a>.</p>
<h2 id="debugging">Debugging</h2>
<p>A skill that you should cultivate as you program is always asking
yourself, “What could go wrong here?” or alternatively, “What crazy
thing might our user do to crash our (seemingly) perfect program?”</p>
<p>For example, look at the program which we used to demonstrate the
<code>while</code> loop in the chapter on iteration:</p>
<script type="text/javascript">(function(d,l,s,i,c){function n(e){e=e.nextSibling;return (!e||e.nodeType!=3)?e:n(e);};function r(f){/in/.test(d.readyState) ? setTimeout(function(){r(f);},9):f()};l=d.getElementsByTagName('script');s=l[l.length-1];r(function(){i=n(s),c=n(i);i.setAttribute('data-src','https://trinket.io/tools/1.0/jekyll/embed/python3#code='+encodeURIComponent(c.nodeValue.replace(/^\s+|\s+$/g,'')));});})(document)</script>
<iframe width="100%" height="400" frameborder="0" marginwidth="0" marginheight="0" class="lazyload" allowfullscreen>
</iframe>
<!--
while True:
    line = input('> ')
    if line[0] == '#':
        continue
    if line == 'done':
        break
    print(line)
print('Done!')
# Code: http://www.py4e.com/code3/copytildone2.py
# Or select Download from this trinket's left-hand menu
-->
<p>Look what happens when the user enters an empty line of input:</p>
<div class="sourceCode" id="cb33"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb33-1"><a href="#cb33-1" aria-hidden="true" tabindex="-1"></a><span class="op">></span> hello there</span>
<span id="cb33-2"><a href="#cb33-2" aria-hidden="true" tabindex="-1"></a>hello there</span>
<span id="cb33-3"><a href="#cb33-3" aria-hidden="true" tabindex="-1"></a><span class="op">></span> <span class="co"># don't print this</span></span>
<span id="cb33-4"><a href="#cb33-4" aria-hidden="true" tabindex="-1"></a><span class="op">></span> <span class="bu">print</span> this<span class="op">!</span></span>
<span id="cb33-5"><a href="#cb33-5" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span> this<span class="op">!</span></span>
<span id="cb33-6"><a href="#cb33-6" aria-hidden="true" tabindex="-1"></a><span class="op">></span></span>
<span id="cb33-7"><a href="#cb33-7" aria-hidden="true" tabindex="-1"></a>Traceback (most recent call last):</span>
<span id="cb33-8"><a href="#cb33-8" aria-hidden="true" tabindex="-1"></a> File <span class="st">"copytildone.py"</span>, line <span class="dv">3</span>, <span class="kw">in</span> <span class="op"><</span>module<span class="op">></span></span>
<span id="cb33-9"><a href="#cb33-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> line[<span class="dv">0</span>] <span class="op">==</span> <span class="st">'#'</span>:</span>
<span id="cb33-10"><a href="#cb33-10" aria-hidden="true" tabindex="-1"></a><span class="pp">IndexError</span>: string index out of <span class="bu">range</span></span></code></pre></div>
<p>The code works fine until it is presented with an empty line. Then
there is no zeroth character, so we get a traceback. There are two ways
to make line three “safe” even if the line is empty.</p>
<p>One possibility is to simply use the <code>startswith</code> method,
which returns <code>False</code> if the string is empty.</p>
<div class="sourceCode" id="cb34"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb34-1"><a href="#cb34-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> line.startswith(<span class="st">'#'</span>):</span></code></pre></div>
<p>Another way is to safely write the <code>if</code> statement using
the <em>guardian</em> pattern and make sure the second logical
expression is evaluated only when there is at least one character in
the string:</p>
<div class="sourceCode" id="cb35"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb35-1"><a href="#cb35-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> <span class="bu">len</span>(line) <span class="op">></span> <span class="dv">0</span> <span class="kw">and</span> line[<span class="dv">0</span>] <span class="op">==</span> <span class="st">'#'</span>:</span></code></pre></div>
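<p>Both fixes can be checked without typing input by hand. The sketch
below is an adaptation, not the original program: it replaces the
<code>input</code> loop with a hypothetical function
<code>copy_until_done</code> that takes a list of lines and returns the
lines that would have been printed.</p>

```python
def copy_until_done(lines):
    """Return the lines that would be printed, stopping at 'done'."""
    printed = []
    for line in lines:
        # Guardian pattern: check the length before indexing
        if len(line) > 0 and line[0] == '#':
            continue
        if line == 'done':
            break
        printed.append(line)
    return printed

sample = ['hello there', "# don't print this", 'print this!', '', 'done', 'ignored']
result = copy_until_done(sample)
print(result)                  # ['hello there', 'print this!', '']

# The startswith alternative is also safe on an empty string
print(''.startswith('#'))      # False
```

<p>The empty line no longer causes a traceback; it is simply copied to
the output like any other line that is not a comment.</p>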
<h2 id="glossary">Glossary</h2>
<dl>
<dt>counter</dt>
<dd>
A variable used to count something, usually initialized to zero and then
incremented.
</dd>
<dt>empty string</dt>
<dd>
A string with no characters and length 0, represented by two quotation
marks.
</dd>
<dt>format operator</dt>
<dd>
An operator, <code>%</code>, that takes a format string and a tuple and
generates a string that includes the elements of the tuple formatted as
specified by the format string.
</dd>
<dt>format sequence</dt>
<dd>
A sequence of characters in a format string, like <code>%d</code>, that
specifies how a value should be formatted.
</dd>
<dt>format string</dt>
<dd>
A string, used with the format operator, that contains format sequences.
</dd>
<dt>flag</dt>
<dd>
A boolean variable used to indicate whether a condition is true or
false.
</dd>
<dt>invocation</dt>
<dd>
A statement that calls a method.
</dd>
<dt>immutable</dt>
<dd>
The property of a sequence whose items cannot be assigned.
</dd>
<dt>index</dt>
<dd>
An integer value used to select an item in a sequence, such as a
character in a string.
</dd>
<dt>item</dt>
<dd>
One of the values in a sequence.
</dd>
<dt>method</dt>
<dd>
A function that is associated with an object and called using dot
notation.
</dd>
<dt>object</dt>
<dd>
Something a variable can refer to. For now, you can use “object” and
“value” interchangeably.
</dd>
<dt>search</dt>
<dd>
A pattern of traversal that stops when it finds what it is looking for.
</dd>
<dt>sequence</dt>
<dd>
An ordered set; that is, a set of values where each value is identified
by an integer index.
</dd>
<dt>slice</dt>
<dd>
A part of a string specified by a range of indices.
</dd>
<dt>traverse</dt>
<dd>
To iterate through the items in a sequence, performing a similar
operation on each.
</dd>
</dl>
<h2 id="exercises">Exercises</h2>
<p><strong>Exercise 5: Take the following Python code that stores a
string:</strong></p>
<p><code>str = 'X-DSPAM-Confidence:</code><strong><code>0.8475</code></strong><code>'</code></p>
<p><strong>Use <code>find</code> and string slicing to extract the
portion of the string after the colon character and then use the
<code>float</code> function to convert the extracted string into a
floating point number.</strong></p>
<p><strong>Exercise 6: Read the documentation of the string methods at
<a href="https://docs.python.org/library/stdtypes.html#string-methods"
class="uri">https://docs.python.org/library/stdtypes.html#string-methods</a>.
You might want to experiment with some of them to make sure you
understand how they work. <code>strip</code> and <code>replace</code>
are particularly useful.</strong></p>
<p><strong>The documentation uses a syntax that might be confusing. For
example, in <code>find(sub[, start[, end]])</code>, the brackets
indicate optional arguments. So <code>sub</code> is required, but
<code>start</code> is optional, and if you include <code>start</code>,
then <code>end</code> is optional.</strong></p>
<aside id="footnotes" class="footnotes footnotes-end-of-document"
role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>A tuple is a sequence of comma-separated values inside a
pair of parentheses. We will cover tuples in Chapter 10.<a href="#fnref1"
class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</aside>
</div>
</div>
</div>
</div>
<script type="text/javascript" src="trinket/jquery.min.js"></script>
<script type="text/javascript" src="trinket/jquery-ui.min.js"></script>
<script type="text/javascript" src="trinket/foundation.min.js"></script>
<script type="text/javascript" src="trinket/anchor.min.js"></script>
<script type="text/javascript" src="trinket/lazysizes.min.js"></script>
<script type="text/javascript" src='trinket/go.js'></script>
</body>