|
28 | 28 | document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible'); |
29 | 29 | Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) { |
30 | 30 | link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1); |
31 | | - });</script><div class=content id=content><main><div class=sidetoc><nav class=pagetoc></nav></div><h1 id=file-properties><a class=header href=#file-properties>File Properties</a></h1><p>In this chapter, you'll learn how to view file details like line and word counts, file and disk sizes, file types, extract parts of file path, etc. You'll also learn how to change file properties like timestamps and permissions.<blockquote><p><img alt=info src=./images/info.svg> The <a href=https://github.com/learnbyexample/cli-computing/tree/master/example_files>example_files</a> directory has the scripts and sample input files used in this chapter.</blockquote><h2 id=wc><a class=header href=#wc>wc</a></h2><p>The <code>wc</code> command is useful to count the number of lines, words and characters for the given input(s).<p><strong>Examples</strong><pre><code class=language-bash># change to the 'example_files/text_files' directory |
| 31 | + });</script><div class=content id=content><main><div class=sidetoc><nav class=pagetoc></nav></div><h1 id=file-properties><a class=header href=#file-properties>File Properties</a></h1><p>In this chapter, you'll learn how to view file details like line and word counts, file and disk sizes, file types, extract parts of file path, etc. You'll also learn how to change file properties like timestamps and permissions.<blockquote><p><img alt=info src=./images/info.svg> The <a href=https://github.com/learnbyexample/cli-computing/tree/master/example_files>example_files</a> directory has the scripts and sample input files used in this chapter.</blockquote><h2 id=wc><a class=header href=#wc>wc</a></h2><p>The <code>wc</code> command is typically used to count the number of lines, words and characters for the given input(s). Here are some basic examples:<pre><code class=language-bash># change to the 'example_files/text_files' directory |
32 | 32 | $ cat greeting.txt |
33 | 33 | Hi there |
34 | 34 | Have a nice day |
|
48 | 48 | 6 25 greeting.txt |
49 | 49 | </code></pre><p>Filename won't be printed for stdin data. This is helpful to save the results in a variable for scripting purposes.<pre><code class=language-bash>$ wc -l <greeting.txt |
50 | 50 | 2 |
51 | | -</code></pre><p>Word count is based on whitespace separation. You'll have to pre-process the input if you do not want certain non-whitespace characters to influence the results. <code>tr</code> can be used to remove a particular set of characters (this command will be discussed in the <a href=/assorted-text-processing-tools.html>Assorted Text Processing Tools</a> chapter).<pre><code class=language-bash>$ echo 'apple ; banana ; cherry' | wc -w |
| 51 | +</code></pre><p>Word count is based on whitespace separation. You can pre-process the input to prevent certain non-whitespace characters from influencing the results. <code>tr</code> can be used to remove a particular set of characters (this command will be discussed in the <a href=./assorted-text-processing-tools.html>Assorted Text Processing Tools</a> chapter).<pre><code class=language-bash>$ echo 'apple ; banana ; cherry' | wc -w |
52 | 52 | 5 |
53 | 53 |
|
54 | 54 | # remove characters other than alphabets and whitespace |
|
62 | 62 | 3 3 20 fruits.txt |
63 | 63 | 15 38 183 sample.txt |
64 | 64 | 20 47 228 total |
65 | | -</code></pre><p>You can use the <code>-L</code> to report the length of the longest line in the input (excluding the newline character of a line). Note that <code>-L</code> won't count non-printable characters and tabs are converted to equivalent spaces. Multibyte characters and <a href=https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries>grapheme clusters</a> will each be counted as <code>1</code> (depending on the locale, they might become non-printable too).<pre><code class=language-bash>$ echo 'apple' | wc -L |
| 65 | +</code></pre><p>You can use the <code>-L</code> option to report the length of the longest line in the input (excluding the newline character of a line). Note that <code>-L</code> won't count non-printable characters and tabs are converted to equivalent spaces. Multibyte characters and <a href=https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries>grapheme clusters</a> will each be counted as <code>1</code> (depending on the locale, they might become non-printable too).<pre><code class=language-bash>$ echo 'apple' | wc -L |
66 | 66 | 5 |
67 | 67 |
|
68 | 68 | $ echo 'αλεπού cag̈e' | wc -L |
|
75 | 75 |
|
76 | 76 | $ printf 'αλεπού' | wc -m |
77 | 77 | 6 |
78 | | -</code></pre><h2 id=du><a class=header href=#du>du</a></h2><p>The <code>du</code> command helps you estimate the size of files and directories.<p><strong>Examples</strong><p>By default, size is given in size in terms of 1024 bytes. All directories and sub-directories are recursively reported, but files are ignored. You can use the <code>-a</code> option if files should also be reported. <code>du</code> is one of the commands that require an explicit option (<code>-L</code> in this case) if you want symbolic links to be followed.<pre><code class=language-bash># change to the 'scripts' directory and source the 'du.sh' script |
| 78 | +</code></pre><h2 id=du><a class=header href=#du>du</a></h2><p>The <code>du</code> command helps you estimate the size of files and directories.<p>By default, sizes are reported in units of 1024 bytes. All directories and sub-directories are recursively reported, but files are not listed individually. You can use the <code>-a</code> option if files should also be reported. <code>du</code> is one of the commands that require an explicit option (<code>-L</code> in this case) if you want symbolic links to be followed.<pre><code class=language-bash># change to the 'scripts' directory and source the 'du.sh' script |
79 | 79 | $ source du.sh |
80 | 80 |
|
81 | 81 | # n * 1024 bytes |
|
112 | 112 | 8.0K todos |
113 | 113 | 48K projects |
114 | 114 | 7.4M report.log |
115 | | -</code></pre><h2 id=df><a class=header href=#df>df</a></h2><p>The <code>df</code> command gives you the space usage of file systems.<p><strong>Examples</strong><p><code>df</code> without path arguments will give information about all the currently mounted file systems. You can specify <code>.</code> to get information only for the current filesystem:<pre><code class=language-bash>$ df . |
| 115 | +</code></pre><h2 id=df><a class=header href=#df>df</a></h2><p>The <code>df</code> command gives you the space usage of file systems. <code>df</code> without path arguments will give information about all the currently mounted file systems. You can specify <code>.</code> to get information only for the current filesystem:<pre><code class=language-bash>$ df . |
116 | 116 | Filesystem 1K-blocks Used Available Use% Mounted on |
117 | 117 | /dev/sda1 98298500 58563816 34734748 63% / |
118 | 118 | </code></pre><p>Use <code>-h</code> option for human readable sizes. The <code>-B</code> option allows you to scale sizes by the specified amount. Use <code>--si</code> for size in powers of 1000 instead of 1024.<pre><code class=language-bash>$ df -h . |
|
131 | 131 | </code></pre><h2 id=stat><a class=header href=#stat>stat</a></h2><p>The <code>stat</code> command is useful to get details like file type, size, inode, permissions, last accessed and modified timestamps, etc. You'll get all of these details by default. The <code>-c</code> and <code>--printf</code> options can be used to display only the required details in a particular format.<pre><code class=language-bash># change to the 'scripts' directory and source the 'stat.sh' script |
132 | 132 | $ source stat.sh |
133 | 133 |
|
134 | | -# last accessed timestamp |
| 134 | +# %x gives last accessed timestamp |
135 | 135 | $ stat -c '%x' ip.txt |
136 | 136 | 2022-06-01 13:25:18.693823117 +0530 |
137 | 137 |
|
138 | | -# last modified timestamp |
| 138 | +# %y gives last modified timestamp |
139 | 139 | $ stat -c '%y' ip.txt |
140 | 140 | 2022-05-24 14:39:41.285714934 +0530 |
141 | 141 |
|
142 | | -# file size in bytes |
143 | | -# followed by a newline |
144 | | -# and then the inode value |
| 142 | +# %s gives file size in bytes |
| 143 | +# \n is used to get a newline |
| 144 | +# %i gives the inode value |
145 | 145 | # same as: stat --printf='%s\n%i\n' ip.txt |
146 | 146 | $ stat -c $'%s\n%i' ip.txt |
147 | 147 | 10 |
148 | 148 | 787224 |
149 | 149 |
|
150 | | -# quoted filenames |
| 150 | +# %N gives quoted filenames |
151 | 151 | # if input is a link, path it points to is also displayed |
152 | 152 | $ stat -c '%N' words.txt |
153 | 153 | 'words.txt' -> '/usr/share/dict/words' |
154 | | -</code></pre><p>You can also pass multiple file arguments:<pre><code class=language-bash># %n gives filenames |
| 154 | +</code></pre><p>You can also pass multiple file arguments:<pre><code class=language-bash># %s gives file size in bytes |
| 155 | +# %n gives filenames |
155 | 156 | $ stat -c '%s %n' ip.txt hi.sh |
156 | 157 | 10 ip.txt |
157 | 158 | 21 hi.sh |
158 | | -</code></pre><blockquote><p><img alt=info src=./images/info.svg> <img alt=info src=./images/info.svg> The <code>stat</code> command should be preferred instead of parsing <code>ls -l</code> output for file details. See <a href=https://mywiki.wooledge.org/ParsingLs>mywiki.wooledge: avoid parsing output of ls</a> and <a href=https://unix.stackexchange.com/q/128985/109046>unix.stackexchange: why not parse ls?</a> for explanation and other alternatives.</blockquote><h2 id=touch><a class=header href=#touch>touch</a></h2><p>As mentioned earlier, the <code>touch</code> command helps you change the timestamps of files. You can do so based on current timestamp, passing an argument, copying the value from another file and so on.<p><strong>Examples</strong><p>By default, <code>touch</code> updates both access and modification timestamp to the current time. You can use <code>-a</code> to change only access timestamp and <code>-m</code> to change only modification timestamp.<pre><code class=language-bash># change to the 'scripts' directory and source the 'touch.sh' script |
| 159 | +</code></pre><blockquote><p><img alt=info src=./images/info.svg> <img alt=info src=./images/info.svg> The <code>stat</code> command should be preferred instead of parsing <code>ls -l</code> output for file details. See <a href=https://mywiki.wooledge.org/ParsingLs>mywiki.wooledge: avoid parsing output of ls</a> and <a href=https://unix.stackexchange.com/q/128985/109046>unix.stackexchange: why not parse ls?</a> for explanation and other alternatives.</blockquote><h2 id=touch><a class=header href=#touch>touch</a></h2><p>As mentioned earlier, the <code>touch</code> command helps you change the timestamps of files. You can do so based on current timestamp, passing an argument, copying the value from another file and so on.<p>By default, <code>touch</code> updates both access and modification timestamps to the current time. You can use <code>-a</code> to change only access timestamp and <code>-m</code> to change only modification timestamp.<pre><code class=language-bash># change to the 'scripts' directory and source the 'touch.sh' script |
159 | 160 | $ source touch.sh |
160 | 161 |
|
161 | | -# last access and modification time |
| 162 | +# last access and modification timestamps |
162 | 163 | $ stat -c $'%x\n%y' fruits.txt |
163 | 164 | 2017-07-19 17:06:01.523308599 +0530 |
164 | 165 | 2017-07-13 13:54:03.576055933 +0530 |
165 | 166 |
|
166 | | -# update access and modification values to current time |
| 167 | +# update access and modification values to the current time |
167 | 168 | $ touch fruits.txt |
168 | 169 | $ stat -c $'%x\n%y' fruits.txt |
169 | 170 | 2022-06-14 13:01:25.921205889 +0530 |
|
189 | 190 | $ touch -c xyz.txt |
190 | 191 | $ ls xyz.txt |
191 | 192 | ls: cannot access 'xyz.txt': No such file or directory |
192 | | -</code></pre><h2 id=file><a class=header href=#file>file</a></h2><p>The <code>file</code> command helps you identify text encoding (ASCII, UTF-8, etc), whether the file is executable and so on.<p><strong>Examples</strong><p>Here are some examples to show how the <code>file</code> command behaves for different types:<pre><code class=language-bash># change to the 'scripts' directory and source the 'file.sh' script |
| 193 | +</code></pre><h2 id=file><a class=header href=#file>file</a></h2><p>The <code>file</code> command helps you identify text encoding (ASCII, UTF-8, etc), whether the file is executable and so on.<p>Here are some examples to show how the <code>file</code> command behaves for different types:<pre><code class=language-bash># change to the 'scripts' directory and source the 'file.sh' script |
193 | 194 | $ source file.sh |
194 | 195 | $ ls -F |
195 | 196 | hi.sh* ip.txt moon.png sunrise.jpg |
|
211 | 212 | </code></pre><p>You can use the <code>-b</code> option to avoid filenames in the output:<pre><code class=language-bash>$ file -b ip.txt |
212 | 213 | ASCII text |
213 | 214 | </code></pre><p>Here is an example of finding particular type of files, say <code>image</code> files.<pre><code class=language-bash># assuming filenames do not contain ':' or newline characters |
214 | | -# awk here helps to print first field of lines containing 'image data' |
| 215 | +# awk here helps to print the first field of lines containing 'image data' |
215 | 216 | $ find -type f -exec file {} + | awk -F: '/\<image data\>/{print $1}' |
216 | 217 | ./sunrise.jpg |
217 | 218 | ./moon.png |
218 | 219 | </code></pre><blockquote><p><img alt=info src=./images/info.svg> See also <code>identify</code> command which "describes the format and characteristics of one or more image files".</blockquote><h2 id=basename><a class=header href=#basename>basename</a></h2><p>By default, the <code>basename</code> command will remove the leading directory component from the given path argument. Any trailing slashes will be removed before determining the portion to be extracted.<pre><code class=language-bash>$ basename /home/learnbyexample/example_files/scores.csv |
219 | 220 | scores.csv |
220 | 221 |
|
221 | | -# quote the arguments when needed |
| 222 | +# quote the arguments as needed |
222 | 223 | $ basename 'path with spaces/report.log' |
223 | 224 | report.log |
224 | 225 | </code></pre><p>You can use the <code>-s</code> option to remove a suffix from the filename. Usually used to remove the file extension.<pre><code class=language-bash>$ basename -s'.csv' /home/learnbyexample/example_files/scores.csv |
|
249 | 250 | </code></pre><p>You can use shell features like command substitution to combine the effects of <code>basename</code> and <code>dirname</code> commands.<pre><code class=language-bash># extract the second last path component |
250 | 251 | $ basename $(dirname /home/learnbyexample/example_files/scores.csv) |
251 | 252 | example_files |
252 | | -</code></pre><h2 id=chmod><a class=header href=#chmod>chmod</a></h2><p>This section will show how you can use the <code>chmod</code> command to change file permissions. Consider this example:<pre><code class=language-bash>$ mkdir practice_chmod |
| 253 | +</code></pre><h2 id=chmod><a class=header href=#chmod>chmod</a></h2><p>You can use the <code>chmod</code> command to change file and directory permissions. Consider this example:<pre><code class=language-bash>$ mkdir practice_chmod |
253 | 254 | $ cd practice_chmod |
254 | 255 | $ echo 'learnbyexample' > ip.txt |
255 | 256 |
|
|
282 | 283 | </code></pre><p>You can also use <code>mkdir -m</code> instead of the <code>mkdir+chmod</code> combination seen above. The argument to the <code>-m</code> option uses the same syntax as <code>chmod</code> (including the format that'll be discussed next).<pre><code class=language-bash>$ mkdir -m 750 backups |
283 | 284 | $ stat -c '%a %A' backups |
284 | 285 | 750 drwxr-x--- |
285 | | -</code></pre><blockquote><p><img alt=info src=./images/info.svg> You can use <code>chmod -R</code> to recursively change permissions. Use <code>find+exec</code> if you want to apply changes only for files filtered by some criteria.</blockquote><p><strong>Changing permissions for specific categories</strong><p>You can assign (<code>=</code>), add (<code>+</code>) or remove (<code>-</code>) permissions by using those symbols followed by one or more <code>rwx</code> permissions. This depends on <code>umask</code> value:<pre><code class=language-bash>$ umask |
| 286 | +</code></pre><blockquote><p><img alt=info src=./images/info.svg> You can use <code>chmod -R</code> to recursively change permissions. Use <code>find+exec</code> if you want to apply changes only for files filtered by some criteria.</blockquote><p><strong>Changing permissions for specific categories</strong><p>You can assign (<code>=</code>), add (<code>+</code>) or remove (<code>-</code>) permissions by using those symbols followed by one or more <code>rwx</code> permissions. This depends on the <code>umask</code> value:<pre><code class=language-bash>$ umask |
286 | 287 | 0002 |
287 | 288 | </code></pre><p><code>umask</code> value of <code>0002</code> means:<ul><li>read and execute permissions without <code>ugo</code> prefix affects all the three categories<li>write permissions without <code>ugo</code> prefix affects only <code>user</code> and <code>group</code> categories</ul><p>Here are some examples without <code>ugo</code> prefixes:<pre><code class=language-bash># remove execute permission for all three categories |
288 | 289 | $ chmod -x hi.sh |
|
353 | 354 | </code></pre><p><strong>13)</strong> Is the following <code>touch</code> command valid? If so, what would be the output of the <code>stat</code> command that follows?<pre><code class=language-bash># change to the 'scripts' directory and source the 'touch.sh' script |
354 | 355 | $ source touch.sh |
355 | 356 |
|
| 357 | +$ stat -c '%n: %y' fruits.txt |
| 358 | +fruits.txt: 2017-07-13 13:54:03.576055933 +0530 |
| 359 | + |
356 | 360 | $ touch -r fruits.txt f{1..3}.txt |
357 | 361 | $ stat -c '%n: %y' f*.txt |
358 | 362 | # ??? |
|
366 | 370 | </code></pre><p><strong>16)</strong> Given the file path in the shell variable <code>p</code>, how'd you obtain the output shown below?<pre><code class=language-bash>$ p='~/projects/square_tictactoe/python/game.py' |
367 | 371 | $ dirname # ??? |
368 | 372 | ~/projects/square_tictactoe |
369 | | -</code></pre><p><strong>17)</strong> Explain what each of the characters mean in the following output.<pre><code class=language-bash>$ stat -c '%A' ../scripts/ |
| 373 | +</code></pre><p><strong>17)</strong> Explain what each of the characters means in the following <code>stat</code> command's output.<pre><code class=language-bash>$ stat -c '%A' ../scripts/ |
370 | 374 | drwxrwxr-x |
371 | 375 | </code></pre><p><strong>18)</strong> What would be the output of the second <code>stat</code> command shown below?<pre><code class=language-bash>$ touch new_file.txt |
372 | 376 | $ stat -c '%a %A' new_file.txt |
|
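The `wc` section in this diff notes that the filename is not printed for stdin data, which makes the count easy to save in a variable, and that `tr` can pre-process the input before counting words. A minimal sketch of both ideas follows, assuming GNU coreutils and a throwaway file created with `mktemp` (the file and the variable names `line_count` and `word_count` are illustrative, not part of the chapter's `example_files`):

```shell
# create a temporary sample file with two lines (hypothetical, for illustration only)
tmp=$(mktemp)
printf 'Hi there\nHave a nice day\n' > "$tmp"

# reading from stdin means wc prints only the count, without a filename
line_count=$(wc -l < "$tmp")
echo "lines: $line_count"

# delete everything except letters, spaces and newlines before counting words
word_count=$(echo 'apple ; banana ; cherry' | tr -cd 'a-zA-Z \n' | wc -w)
echo "words: $word_count"

# clean up the temporary file
rm -f "$tmp"
```

Without the `tr` step, the two `;` characters would be counted as words as well, since `wc -w` splits only on whitespace.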