-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathfaq.html
395 lines (258 loc) · 17.4 KB
/
faq.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
---
layout: default
title: FAQ
---
<h3>How do I enable debug logging?</h3>
<p>Set the following in piler.conf, then restart all piler daemons</p>
<code>
verbosity=5
</code>
<h3>After upgrading from openssl 1.1 to 3.x reindex crashes</h3>
<p><a href="https://bitbucket.org/jsuto/piler/issues/1284/reindexer-has-problems-in-digest_string" target="_blank">https://bitbucket.org/jsuto/piler/issues/1284/reindexer-has-problems-in-digest_string</a></p>
<h3>What timezone should I use?</h3>
<p>I suggest to use UTC.</p>
<h3>Archive size mismatch</h3>
<p>Piler stores emails and attachments as distinct files in the filesystem. The filesystem type (eg. ext4, xfs, …) and its settings affects the used disk space. There's a difference between the apparent size and the size occupied on disk. The latter might be significantly larger especially with bigger block sizes when you have mostly small messages.</p>
<p>If you know that's the case for you in advance, then try using xfs with blocksize of 1024.</p>
<p>See more at <a href="https://bitbucket.org/jsuto/piler/issues/961/archive-size-mismatch" target="_blank">https://bitbucket.org/jsuto/piler/issues/961/archive-size-mismatch</a></p>
<h3>What performance can piler achieve?</h3>
<p>The available performance depends on the HW capacities you have, eg. number of CPU cores, memory, disk IO, network speed, as well as application settings (especially mysql innodb parameters, see below) etc.</p>
<p>I've done the following performance testing on various configurations:</p>
<p>I was running the following command to test:</p>
<code>
smtp-source -c -s 20 -m 10000 -l 100000 127.0.0.1:25
</code>
<p>It sent 10,000 of 100 kB messages to piler to the localhost using 20 concurrent connections. The aim of this was to eliminate any network issue. The messages were generated by smtp-source (part of the postfix package). Note that they were artificial emails without any attachment to prevent any external helper utility to introduce a performance penalty.</p>
<p>I've used Scaleway to host the various instances and run Ubuntu Xenial 64-bit.</p>
<p>A) VC1S instance (2 X86 64bit Cores, 2GB memory), 3 piler child processes: #1: 4:11 = 2340 msg/s #2: 4:05 = 2400 msg/s #3: 4:03 = 2460 msg/s</p>
<p>B) VC1L instance (6 X86 64bit Cores, 8GB memory), 7 piler child processes: #1: 2:19 = 4260 msg/s #2: 2:08 = 4680 msg/s #3: 2:01 = 4920 msg/s</p>
<p>TODO: Add some explanation, etc.</p>
<h3>How to change the retention for all emails?</h3>
<p>Connect to the piler mysql database, then run the following SQL statement to set the retained column to 180 days:</p>
<code>
update metadata set retained = sent + 86400*180;
</code>
<h3>pilerimport: IMAP username cannot contain spaces</h3>
</p>Use double quotes inside of single quotes, eg.</p>
<code>
pilerimport -u '“User Name”' ...
</code>
<p>See <a href="https://bitbucket.org/jsuto/piler/issues/952/pilerimport-imap-username-cannot-contain" target="_blank">https://bitbucket.org/jsuto/piler/issues/952/pilerimport-imap-username-cannot-contain</a> for more.</p>
<h3>What are the recommended mysql settings?</h3>
<p>Several key information, data are stored in mysql, so you have to make sure mysql has all the resources it needs to perform properly. Quite often the default settings are not enough beyond a certain archive size, because you need bigger buffers, etc.</p>
<p>As a guideline, consider my settings for a not that large archive. Your mileage may wary, so be sure to check out the mysql documentation for innodb!</p>
<code>
<pre>
innodb_buffer_pool_size = 256M
innodb_flush_log_at_trx_commit=1
innodb_log_buffer_size=64M
innodb_log_file_size=64M
innodb_read_io_threads=4
innodb_write_io_threads=4
innodb_log_files_in_group=2
innodb_file_per_table
</pre>
</code>
<h3>How to backup and restore the archive?</h3>
<p>You have two options:</p>
<p>#1: every day export yesterday emails, eg.</p>
<code>
<pre>
cd /opt/backup
pilerexport -a 2017.09.18 -b 2017.09.18
tar cfz /path/to/backup-2017-09-18.tar.gz *
rm *
</pre>
</code>
<p>The restore procedure in this case is to deploy piler from scratch, then import all the saved emails.</p>
<p>#2: backup the piler, sphinx data, and the configs</p>
<p>Backup regularly the following:</p>
<ul>
<li>/usr/local/etc/piler directory (piler and sphinx config files, including piler.key file)</li>
<li>/var/piler/store/00 directory contents (note that only the last directory in '00' changes)</li>
<li>/var/piler/sphinx/main[1-4]* files</li>
<li>piler mysql database</li>
<li>nginx/apache configs</li>
</ul>
<p>The restore procedure is to deploy piler, then restore the previously saved files and directories mentioned above.</p>
<h3>mysql_stmt_execute error: *Incorrect string value: '\xF0\x9D\x91\x83\xF0\x9D…' for column 'body' at row 1* (errno: 1366)</h3>
<p>Mysql supports utf-8 character set. However, mysql does NOT support the complete utf-8 code table, only a subset of it. If the email has a character or symbol in the unsupported region (think of the 4 byte values), then it can't store such a value if you are using mysql 5.7 or newer (or mariadb 10.2 or newer).</p>
<p>The solution is either to use an older mysql version (eg. 5.5 or mariadb 10.1) which is more forgiving with 'invalid' (=read valid, but not supported by mysql version's of utf-8) utf-8 character sequences. Or switch to utf8mb4 encoding which supports the whole range of the utf-8 characters.</p>
<p>See <a href="https://mathiasbynens.be/notes/mysql-utf8mb4" target="_blank">https://mathiasbynens.be/notes/mysql-utf8mb4</a> for a detailed guide on the problem, and the solution. The generic steps are (before executing any of these, be sure to read the mentioned article!):</p>
<code>
<pre>
# For each database:
ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
# For each table:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# For each column:
ALTER TABLE table_name CHANGE column_name column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
</pre>
</code>
<p>Note that piler 1.3.x uses utf8mb4 by default, so this issue is fixed.</p>
<p>Also, you may check out <a href="https://bitbucket.org/jsuto/piler/issues/1088/provide-instructions-on-how-to-migrate" target="_blank">https://bitbucket.org/jsuto/piler/issues/1088/provide-instructions-on-how-to-migrate</a> for a user contributed procedure.</p>
<h3>I can see mysql_stmt_execute error: *Out of range value for column 'retained' at row 1 errors in the mail log</h3>
<p>See <a href="https://bitbucket.org/jsuto/piler/issues/813/" target="_blank">#813</a> fix on Bitbucket.</p>
<h3>My older emails are gone!</h3>
<p>Follow these instructions to figure out if you really lost your emails. Chances are that you haven't!</p>
<ul>
<li>check the metadata table, and verify that your emails are there</li>
<li>find some of your “lost” emails under the /var/piler/store/00 directory. The files are named based on the their metadata.piler_id value. If you can find them, then your emails are in the archive.</li>
<li>double check with the pilerget utility, ie. if eg. <code>pilerget 4000000057950fd127ea028c005a30f2f951</code> returns the email (don't forget to use your *actual* piler_id!), then the email is in the archive</li>
<li>check the sphinx database (not a mysql database, sphinx has its own database which can be accessed with <code>mysql -h 127.0.0.1 -P 9306</code>) if the older IDs (the same as in metadata.id) are present: <code>select * from main1,dailydelta1,delta1 where id=123;</code> (also use the actual id!)</li>
</ul>
<h3>Why am I getting a “cannot write current directory!” message?</h3>
<p>Most of the piler binaries are setuid/setgid to piler:piler in order to read/write the store directory. It means that these commands run as user/group piler. The above error message is pretty self explaining: the piler user can't write the current directory. Solution: <code>cd /tmp</code> or <code>cd /var/piler/imap</code>, ie. cd to a directory where piler has rwx permissions.</p>
<h3>What's the 'master branch'?</h3>
<p>The master branch is the developer version of piler, kinda the nightly build. This contains all the latest (and probably unstable) code. It doesn't mean that the master branch is a buggy crap, however it does mean that the code is not in its final shape. Just with any other software, unless you are told otherwise, please use a stable version.</p>
<h3>What exclusion (formerly: archive) and retention rule do I need to archive every email, and keep them for 5 years?</h3>
<p>By default piler archives everything it receives, so you don't need any rules for that. In fact, the exclusion (formerly: archiving) rules define what NOT to archive. Also the default_retention_days value in piler.conf applies to all emails, unless it's overridden in the retention rules. So in this just set <code>default_retention_days=1827</code>, restart piler, and you are fine.</p>
<h3>I can see lots of duplicates in the archive</h3>
<p>The duplicate detection is based on the Message-ID header field. If the piler daemon detects an already known message-id, then it discards the email. As you can see, the Message-ID header is crucial. The piler daemon may archive a single message with a missing message-id field, and discards the subsequent messages having no message-id.</p>
<p>The pilerimport utility, however, accepts messages even without message-id, because it assumes you really want to archive those messages. So be sure to remove any duplicates before importing them with pilerimport, otherwise you'll have duplicates in the archive.</p>
<p>The only negative side effect is the extra resources required, eg. disk space, memory, etc.</p>
<p>Some internal email sending applications don't know how to create a valid email header, and neglect the message-id. They should be fixed ASAP!</p>
<h3>I don't have a /var partition, how to make the disk space projection work?</h3>
<p>Set the following in config-site.php:</p>
<code>
</pre>
$config['DATA_PARTITION'] = '/';
</pre>
</code>
<h3>I've just sent a few test emails, however no search results in the gui</h3>
<p>The piler daemon processes email as soon as they are received. However emails are indexed in every 30 mins. So an email is visible in the GUI 30 mins. You may adjust the indexing frequency by editing the cron jobs of piler.</p>
<p>Note that piler supports manticore as a drop-in replacement of sphinx. Also, both manticore and sphinx supports real-time (RT) index data. In that case you don't need the indexer jobs in crontab, and the archived emails are visible in the gui immediately.</p>
<h3>I can't see any results, however the sphinx query in the maillog reports 0 hits, and >0 total found:</h3>
<code>
Aug 11 20:21:36 mailpiler piler-webui[10670]: sphinx query: 'SELECT id FROM main1,dailydelta1,delta1 WHERE MATCH('@(subject,body) whatever I need') ORDER BY `sent` DESC LIMIT 0,1000 OPTION max_matches=1000' in 0.29 s, 0 hits, 25508 total found
</code>
<p>You use an outdated and not supported version of sphinx, typically 2.0.x on older Debian or Ubuntu releases. Make sure you have a recent 3.x version, or even manticore search 6.x. You may download it from http://sphinxsearch.com/downloads/release/.</p>
<h3>I want to see more hits!</h3>
<p>Edit config-site.php, and set MAX_SEARCH_HITS, eg.:</p>
<code>
<pre>
$config['MAX_SEARCH_HITS'] = 2000;
</pre>
</code>
<p>Note that you should keep this number to a sane value, the bigger it is the more resources you need for both searchd and php.</p>
<h3>I forgot the admin@local password</h3>
<p>Connect to the piler database, and set the crypted password manually, eg:</p>
<code>
<pre>
mysql> update user set password='$5$TXL7EX$s17XtxwbCs1MDAzuulF/STauTkH0h/KJGHudlNQt3R4' where uid=0;
</pre>
</code>
<p>The above crypted password sets the admin@local password to <strong>pilerrocks</strong></h3>
<h3>How can I delete everything from the archive?</h3>
<p>Follow the procedure below to do so, and it will reset the archive as if it were freshly installed.</p>
<p>#1: Stop searchd and piler:</p>
<code>
<pre>
/etc/init.d/rc.searchd stop
/etc/init.d/rc.piler stop
</pre>
</code>
<p>#2: Remove stored emails:</p>
<code>
<pre>
rm -rf /var/piler/store/00/*
</pre>
</code>
<p>#3: Drop and recreate the piler database:</p>
<code>
<pre>
mysql> drop database piler;
mysql> create database piler character set utf8;
mysql -u piler -p piler < /path/to/piler-source-directory/util/db-mysql.sql
</pre>
</code>
<p>#4: Reset the sphinx index data</p>
<code>
<pre>
rm -rf /var/piler/sphinx/*
su - piler
indexer --all
</pre>
</code>
<p>#5: Start searchd and piler:</p>
<code>
<pre>
/etc/init.d/rc.searchd start
/etc/init.d/rc.piler start
</pre>
</code>
<h3>Qmail (using as a smarthost) rejects emails from piler saying: 451 See http://pobox.com/~djb/docs/smtplf.html</h3>
<p>Set the following in config-site.php:</p>
<code>
<pre>
$config['EOL'] = "\r\n";
</pre>
</code>
<h3>I can see only today's emails in the archive and not any single previous emails.</h3>
<p>Some Linux distributions (notably Debian and Ubuntu) have a daily cron job to reindex everything. Unfortunately this ruins the sphinx index files piler relies on. However the older emails are not lost you still have them, they are just disappeared from the sphinx index. To bring them back, perform the following steps.</p>
<p>1. Edit /etc/default/sphinxsearch, and set <code>START="no"</code>.</p>
<p>I recommend you to use the piler shipped init.d/rc.searchd script to start searchd. You may call it from /etc/rc.local. (Note that it starts it as user piler, so make sure /var/piler/sphinx has proper ownership.) Or you may use the systemd services for piler, see the systemd dir in the piler source code.</p>
<p>2. Reindex old emails. After that older emails should appear after the next indexing is done.</p>
<code>
<pre>
cd /tmp
reindex -a
</pre>
</code>
<h3>How to add support for CJK (Chinese, Japanese, and Korean) languages?</h3>
<p>Fix the sphinx indexer settings, ie. add the following two lines to all “index” sections in sphinx.conf:</p>
<code>
<pre>
ngram_len = 1
ngram_chars = U+3000..U+2FA1F
</pre>
</code>
Finally chdir to /tmp, the reindex everything as the piler user.</p>
<code>
<pre>
cd /tmp
su piler
reindex -a
</pre>
</code>
<h3>I get a lot of “Format a4 is redefined”, and other messages when importing.</h3>
<p>These warning or errors come from the external helper utilities, eg. pdftotext.</p>
<p>They are usually harmless, since piler is interested in only the textual part. However in case of a worst case scenario the given attachment is simply not indexed, and thus is not searchable, but the email is archived, and can be retrieved without any problem.</p>
<h3>Outlook 2007 can't open a downloaded EML file.</h3>
<p>Check out <a href="http://www.msoutlook.info/question/354" target="blank">http://www.msoutlook.info/question/354</a> for a registry hack.</p>
<h3>You get the following error: “Error: SQLSTATE[HY000] [2003] Can't connect to MySQL server on '127.0.0.1' (111) on database: sphinx”</h3>
<p>Start searchd</p>
<h3>I have Centos / Redhat, and when I click on the “Apply” button in the gui, it says: “add the following to /etc/sudoers: 'www-data all=NOPASSWD: /etc/init.d/rc.piler reload'“</h3>
<p>Add the following to /etc/sudoers:</p>
<code>
<pre>
apache ALL=NOPASSWD: /etc/init.d/rc.piler reload
Defaults:%apache !requiretty
</pre>
</code>
<h3>I get the following warning in mysql logs:</h3>
<code>
<p>[Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE# SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO sph_counter (counter_id, max_doc_id) SELECT 1, MAX(id) FROM sph_index</p>
</code>
<p>Try setting MIXED format for binlog_format in the my.cnf file</p>
<code>
<pre>
binlog_format = MIXED
</pre>
</code>
<h3>mysql -u piler -p piler < ./util/db-mysql.sql gives me the following error: ERROR 1071 (42000) at line 109: Specified key was too long; max key length is 1000 bytes”</h3>
<p>Make sure you have enabled the InnoDB storage engine, and try to create innodb tables (not myisam!).</p>
<h3>I have lots of email addresses (100+) in the search query, and the query in the mail log is truncated, not ending with LIMIT xx,yy</h3>
<p>Increase the thread_stack variable, and restart the searchd daemon. Be sure to read <a href="https://sphinxsearch.com/docs/latest/conf-thread-stack.html" target="_blank">https://sphinxsearch.com/docs/latest/conf-thread-stack.html</a>!</p>
<h3>After importing EML files extracted from a PST file, some emails have “From: 'Some Name' <MAILER-DEAMON>” (or the To: field is “redacted”) in the email header.</h3>
<p>It's not a piler bug, the problematic emails have bogus From and/or To headers. The solution is to fix them BEFORE importing. If you realise it only after importing, then purge the archive (if you can afford this) (see the FAQ item above on how), fix the emails, and import them again.</p>
<h3>How much memory sphinx requires</h3>
<p>Connect to sphinx, then run the following statements:</p>
<code>
<pre>
show index main1 status like 'ram_bytes';
show index dailydelta1 status like 'ram_bytes';
show index delta1 status like 'ram_bytes';
</pre>
</code>
<p>See the explantion at <a href="https://sphinxsearch.com/blog/2011/11/11/sphinx-memory-consumption/" target="_blank">https://sphinxsearch.com/blog/2011/11/11/sphinx-memory-consumption/</a>.</p>