-
Notifications
You must be signed in to change notification settings - Fork 36
/
Changelog
838 lines (688 loc) · 40.4 KB
/
Changelog
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
~~~ Version 2.5
Improvements include (in no particular order):
* Many updates to documentation and man pages
* The YAML library is now included within the PLFS library
This means it is no longer necessary to link in the YAML library. It is only
necessary to link in libplfs when building against PLFS.
* PLFS API functions now return a consistent set of return types
The new return types are enumerated in plfs_error.h which is now part of a
PLFS installation. Plfs.h includes plfs_error.h, so it is not necessary to
include it separately.
* PLFS adminstration tools have been moved to sbin
Dcon, findmesgbuf, plfs_remove and plfs_query now reside in <prefix>/sbin.
* New analysis tool in analysis subdirectory
A new analysis tool has been created for analyzing PLFS files and performance.
Please see the documentation in that directory for more information.
* Removed outdated scripts and scripts directory
The scripts directory was home to several scripts that are no longer usable.
In many cases, they were not needed anymore.
Bug fixes:
* N-1 MPI/IO doesn't use multiple backends
Issue #57: a bug was identified where files were not distributed evenly over
backends when using N-1. The distribution of backend use has been optimized.
* Transport Endpoint Disconnected
Issue #166: due to race conditions and non-thread safe classes, it was
possible to have the PLFS daemon seg fault and crash without FUSE knowing.
Then, when FUSE goes to access anything within the PLFS mount point, an error
would occur. The error was "Transpoint Endpoint Disconnected". Thread-safety
has been increased and FUSE stability has been increased.
* plfs_query on non-container
Issue #190: plfs_query now does not error out when running on a flat-file
(N-N) mountpoint.
* O_EXCL in container mode
Issue #249: poor performance was found when PLFS was told to open a file using
O_EXCL. The performance has been improved.
* PLFS and random offsets
Issue #296: PLFS was found to use calls to rand() not uniformly across all
ranks. This resulted in MPI processes being out of sync with each other in
relation to randomly generated numbers. A thread-safe wrapper has been created
and PLFS no longer interferes with running processes' random numbers.
* Free buffer used in ADIO patch set
Issue #297: Some buffers used in ad_plfs were not being properly freed. These
buffers are now properly freed which helps with memory pressure.
* Update version stamps for droppings and .plfsdebug
Issue #317: Various locations that provide PLFS version information were not
doing so consistently. Some of this had to do with the transition from SVN to
GIT. Now, the process of getting the PLFS version is consistent and the
version of PLFS that created a file is contained in that file.
* EPERM from calling setgroups() with PLFS running as non-root
Issue #318: PLFS was found to be calling setgroups() regardless of whether or
not PLFS was running as root. Now, setgroups() is only called if PLFS is
running as root.
~~~ Version 2.4
PLFS 2.4 introduces many new features and enhancements in addition to the bug
fixes listed further below. Improvements include (in no particular order):
* Three new IOStore protocols
FUSE, HDFS, and PVFS are now supported natively by PLFS.
* CMake build system
CMake has replaced Autotools as the build system for faster and easier
configuration and compilation. Documentation on the new build process is in
README.install in the root of the source tree.
* New plfsrc config file format
PLFS configuration has moved from the old custom format to a new YAML 1.2
compliant specification that allows for easier configuration and management.
Documentation on the new format is listed in the plfsrc man page located in
man/man5/plfsrc.5in (or "man plfsrc" if you've installed PLFS 2.4 to somewhere
in your PATH).
* GLib IOStore
The GLib IOStore has been updated to provide a configurable buffer size that
improves small write performance dramatically on many systems.
* 1-N small file mode
The new "Small File Mode" for 1-N writing dramatically improves file creation
performance. Combined with the Glib IOStore buffering, 1-N mode essentially
allows for bandwidth-constrained file creation performance rather than
metadata-constrained.
* Speed improvements
Various code-corrections and optimizations improve performance for many
workloads in N-1, N-N, and 1-N modes.
* ADIO cleanup
The ADIO layer has been cleaned up for consistency. This change requires a
rebuild of MPI with the new ad_plfs code.
* OpenMPI 1.7.x support
OpenMPI 1.7.x is now supported. Patch notes are available in
mpi_adio/README.patching.
* Index compression option
Contiguous index entry compression can be disabled if desired, see the plfsrc
man page for details.
* Intel/PGI compilers
Intel and PGI compilers are now usable, but not extensively tested.
* MPIIO updates
A new MPIIO hint has been added to allow "uniform restart" mode to reduce
memory usage when reading back a file with the same number of processors as it
was written with. Details are in mpi_adio/ad_plfs/ad_plfs_hints.c.
* Werror and Wshadow added to default build
In order to help keep code to the guidelines outlined in README.developer,
Werror and Wshadow are now used by default in the build process. Code cleanup
resulted from this change.
* ADIO support for newer MPICH
PLFS-prep patches now exist for MPICH2-1.5 and MPICH-3.0.3. With the release
of MPICH2-1.5, the MPICH folks completely redid their build system. The
make_ad_plfs_patch script has been updated to relfect this change. This means
that versions before MPICH2-1.5 can no longer be completely patched by using
make_ad_plfs_patch. Details are in mpi_adio/README.patching.
Bug Fixes:
* PLFS Segfault Without PLFSRC
Issue #69: A call to any PLFS library function without a valid PLFSRC file
available results in a segfault. Now the functions note that there is no
PLFSRC file and exit.
* Read Returns Zero Bytes
Issue #73: The issue has a reproducer that would cause the problem for 32-bit
architectures. We aren’t aware of anyone using 32-bit architectures, and it
doesn’t repeat for 64-bit architectures. We have added the reproducer to the
regression suite.
* PLFS Daemon Crashes Logged
Issue #163: There have been occasions where the PLFS daemon (a FUSE-based
application daemon) crashes and we don’t know why. All crashes are trapped and
as much information as is available is logged. This should help discover the
sources of crashes, should they occur in the future.
* PLFS Library Removes “plfs:” MPI/IO Prefix
Issue #172: The path names weren’t handled consistently. The PLFS library
doesn’t need the “plfs:” prefix. That is currently only needed by the MPI/IO
patch to tell MPI/IO that this is a PLFS target and to route the calls to
PLFS. Now all PLFS library routines strip the “plfs:” from the file
specification. Additionally, a new function, is_plfs_path, can take any path
name and return True (1) or False (0).
*Rename Test Failure
Issue #174: Mount points with multiple backends would fail to rename a file
using the POSIX interface. This error is fixed. Renaming works in all cases
and the regression test suite tests all cases to ensure it continues to work.
* Defer Data/Index File Creation Until Written
Issues #177, #188: For efficiency, we don’t create files until they have data
written to them. The PLFSRC file can define “lazy_droppings 0” to disable this
and have all files created when they are created, whether or not any data is
ever written to them.
* Latest Data for Small File Mode Not Available
Issue #178: When testing where certain processes created files and others
stat’d them, it was discovered that the files weren’t flushed to storage on
close. This was part of the design that is not POSIX standard-compliant to
make small files efficient. plfs_flush_writes( char *path ) and
plfs_invalidate_read_cache( char *path ) were implemented so that users could
force flushing data to storage and getting the latest data from storage when
the application knows this is required.
* Implement MPI/IO Hint for Uniform Restart
Issue #179: Uniform Restart assumes that a file will be read by the same
number of processes as were used to write it. This allows each process to read
only the section of the index for that process, thus not filling memory with
other processes’ part of the index. This is useful for very large files.
However, we want to be able to choose when to do this or not with the same
PLFS version. So, there is now a hint that is passed to MPI/IO telling it
whether or not to assume uniform restart.
* Corrected Documentation and Cleaned-Up PLFS API
Issue #185: Several functions listed in plfs.h were actually implemented in
internal C/C++ source files, not in plfs.cpp. Those function specifications
were moved to the appropriate include files. plfs.h only contains functions
that are intended for PLFS client users to call. Man pages were updated to
document all plfs.h functions. Man pages were also updated to document the new
YAML parsing of the PLFSRC file and CMake use for build.
* Include File config.h Error
Issue #186: config.h was included in plfs.h, causing compile/link time
warnings. plfs.h was changed so that the cmake process determines the macros
that need to be there when the build is done, based on configuration
parameters.
* Don’t Create Zero-Length Files
Issue #188: Doing the actual creation of files that are zero-length incurs an
overhead burden that is not necessary. The implementation recognizes their
existence without actually creating the underlying physical representation if
there is no data to store.
~~~ Version 2.3
* IOStore
Issue #120: IOStore is the virtual layer that acts as an interface to physical
storage. It adds no new functionality in this release, but will be utilized in
future releases. For a working example, look at the POSIX IOStore in src.
* File Descriptor Limitation
Issue #114: There is a limit to the number of files that a process may have
open at any given time. This limit is determined with this command:
% limit descriptors
descriptors 1024
Some systems will allow you to change your limit by specifying a new limit.
% limit descriptors 2048
% limit descriptors
descriptors 2048
Other systems will require a privileged user to do this. On the large system
at LANL, Cielo, the System Admins set the limit for a select user group on
ci-mesh to a higher number so that serial inspection applications that need to
open all files in a set could do so without error.
To close this issue, we still need to implement a cache mechanism so that the
PLFS library can live within any limit.
* Correct plfsrc man pages
Issue #121: Some of the descriptions of the plfsrc parameters in the man page
were incorrect. For example, "statfs" did not explain that it must follow a
"mount_point" directive and only modifies the "mount_point" directive that it
follows. Parameter explanations were clarified and corrected.
* Call "readdir" and all PLFS API Calls without FUSE Mount
Issue #123: We need to be able to call "readdir" and all PLFS library API
calls without requiring a FUSE mount to be present. The PLFS library was
intended to be used by applications and not require an operating system file
system mount to be active. Rather, the PLFS library can use the PLFSRC file to
determine the validity and backend storage of a PLFS file system.
# Data Sieving and O_RDWR in ad_plfs
Issue #124: We never want data sieving turned on for PLFS. Data sieving
requires O_RDWR, but O_RDWR badly affects PLFS performance. The ad_plfs suite
of files was fixed for Cray's MPI and OpenMPI such that data sieving is
always off and that it does not allow O_RDWR to be used to open a file.
# Removal of weak symbols in ad_plfs.c
Issue #134: Weak symbols are now surrounded by C directives. By default, users
will not see them. Cray developers will be able to use them as the directives
are Cray-specific.
~~~ Version 2.2.3
* Cleaned-up MPI/IO Patching Process
Ticket #87 (from old Sourceforge.net repository): Cleaned-up files and added
necessary logic to handle patching for MPICH2 and OpenMPI 1.4.x and 1.6.
* Changing N-to-N Directory Permissions Broken
Ticket #84: Ensured that the files under a directory had their ownership and
permissions changed to match the directory that contains them.
* Made mlog More Clear with Respect to Types
Ticket #95: Made some changes to mlog and type casting to make it type-correct.
* Fixed N-to-N Overwrite Pattern on N-to-1 Mount Point
Ticket #96: N-to-N files were not reporting the correct size when they were
overwritten using an N-to-1 mount point. They now report the correct sizes
after each overwrite.
* Fixed N-to-1 (Container Mode) Permission Bit Setup
Ticket #97: PLFS represents a file in this mode as a combination of
directories and files. The permissions on the directories, in particular,
have to have certain bits set in order to allow the specified access to
the underlying files that represent the PLFS file.
* Fixed Makefile.am for OpenMPI 1.6.x Patch
Ticket #98: 'make dist' didn't work for OpenMPI 1.6.x. We fixed
the Makefile.am so that it does work.
* Fixed Unlink with New Permission Scheme
Ticket #99: After implementing the new permissions scheme for PLFS files,
we fixed unlink so that it worked correctly.
* Transferred Permissions Fixes to Parallel Operations Branch
Ticket #100: We are working on making certain file operations parallel.
That is in another branch. We wanted to make sure that all the permissions
work was transferred to that branch so that the permissions work persists in
future releases.
* Fixed Rename in Container Mode
Ticket #101: Renaming files in container mode was not working. It is now.
* Fixed Handling of Truncation on File Creation
Tickets #103, 104, and 107: There were some issues with files being incorrectly
truncated when they were created. MPI and POSIX do things differently. It now
behaves correctly for all cases.
* Use plfs_access instead of lstat
Ticket #106: In the MPI/IO interface we were using lstat. We are using
plfs_access instead.
* Correctly Handle Use of MPI_MODE_EXCL
Ticket #108: Fixed the MPI/IO interface to correctly return an error when
MPI_MODE_EXCL is used and the file exists.
~~~ Version 2.2.2
* Ticket #86: Fixed an issue in ad_plfs/ad_plfs.c so that Open MPI v1.6 can be
patched. Updated supporting patches and patching scripts to be able to work
with more than one version of Open MPI.
~~~ Version 2.2.1
* Fixed an issue in ad_plfs/ad_plfs_fcntl.c where a non-collective call to
MPI_File_get_size() would hang and could crash on a subsequent close call from
another MPI rank.
~~~ Version 2.2
* Error Copying Large File Merges Contiguous Index Entries
Tickets #7 & #57: This problem was noticed copying a ~340 GB restart dump file
from a non-PLFS file system to a PLFS file system. It resulted in an error
exit code 134 opening the file to read. This was because the PLFS index file
was larger than the memory available to hold it. There was an outstanding
ticket (#7) to merge contiguous index entries for scalability. This was the
answer to this problem. An index compression algorithm was implemented so that
contiguous index entries are merged by O(1K). The compression factor can be
changed as files get orders of magnitude larger.
* Fixed Misleading Variable Names in WriteFile.cpp
Ticket #19: PLFS has the concept of logical and physical files. Some of the
variables in WriteFile.cpp were misleading about what was logical and what was
physical. This made the code confusing to developers. The merge from EMC
engineering that added a LogicalFD class and moved all the code into
ContainerFD and added FlatFileFD cleaned all this up. Now only the expandPath
code ever sees a logical path. All the internal container and flatfile code
and index code (which is part of container) see only physical paths.
* PLFS Tools Indicate Version
Tickets #28 & #42: There are a variety of utilities that come with PLFS. Each
now takes a "-version" flag that will return the version of PLFS with
which they were distributed. The PLFS library also has a function that returns
the version for applications that use the PLFS API directly.
* plfs_query Made More User-Friendly
Tickets #32, #52, & #55: This utility is intended primarily for system
administration and PLFS development personnel. It enables one to learn more
about the backend files that comprise a PLFS file or figure out to which PLFS
file one of the PLFS mount point backend files belongs.
* Fixed Code to Eliminate Comparison Warnings
Ticket #37: There were several warnings about comparisons involving signed and
unsigned values. Many of them were comparing to see if an unsigned integer was
less than zero (impossible). These were all fixed.
* Include Directive in PLFSRC Resulted in Default Values Being Used
Tickets #41 & #44: PLFS verision 2.1 introduced the ability to include files
as part of the PLFSRC file. A bug that reset required parameters to default
values every time a new file that was included was parsed was fixed. This was
most noticeable in that it forced include files to be put in a certain order
or values, specifically num_hostdirs, were changed back to incorrect default
values. The PLFSRC file parser now expands all include files and ensures that
resulting PLFSRC file, as a whole, is correct.
* plfs_check_config Makes Non-Existent Directories
Ticket #43: This utility is intended primarily for system administration and
PLFS development personnel. It enables one to check if the PLFS configuration
defined in the PLFSRC file is valid. The plfs_check_config utility will tell
the user when backend directories don't exist for a mount point. Using the
"-mkdir" switch will create those backends that don't exist.
* Race Condition for N-1 I/O through FUSE
Ticket #45: N-1 I/O through FUSE jobs would fail because the file could not be
found. This was a race condition between creation and when the file was
accessed. All tests that reproduced this error now work correctly. The PLFS
team does not recommend this I/O pattern through FUSE because it is a slow
pattern with additional overhead by FUSE. However, it now works for
completeness.
* Rewinddir Fixed
Ticket #47: The POSIX call, rewinddir, could not be used on PLFS. Instead, the
user had to closedir then opendir again. It turned out to be a C macro that
was the only line in an "if" statement's "then" clause. No
curly braces, {}, were used because it was one line. However, the macro
expanded to two statements and so the second statement was always executed,
even when it should not have been executed. We've changed our coding
standard to ask developers to always use curly braces, {}, irrespective of the
number of statements that follow (for loops, ifs, etc.).
* ADIO Interface Patched for Cray
Tickets #48 & 56: Fixed the ADIO interface and add some patches so that it can
be compiled out of the PLFS repository for use with Cray's xt-mpich2 MPI
without Cray having to make changes. Furthermore, all functions have either
"plfs" or "adplfs_", depending on where they are defined, prepended to them so
that they don't conflict with other libraries.
* Better Logging Capability Implemented
Ticket #53: PLFS used a custom plfs_debug logging capability. This was
transitioned to a more flexible one based on "mlog". Information on how to
configure the mlog-based logging is in README.mlog.
* plfs_init Ensures "init" Routines Called One Time
Ticket #59: There were several things that happened when PLFS gets
initialized. For a while some things were getting done more than one time and
that caused problems. The plfs_init routine has been fixed so that all the
initializations are done one time and there are no unintended side-effects to
multiple initializations.
* Error in ANL Test noncontig.c
Ticket #60: This bug was caused due to opening the file in Read-Write mode w/
truncate. There was a bug with trunc of an open file. Initially the bug was
that metadata droppings wouldn't rename properly, but this unveiled a separate
bug where physical offsets wouldn't get correctly updated. These are both
fixed. The PLFS team recommends against using O_RDWR. Please use O_RDONLY or
O_WRONLY. There are performance penalties for using O_RDWR.
* "Flat File" Mode for Better N-N Workload Performance
Ticket #64: The "Flat File" concept was developed and implemented for the N-N
workload. Any one file is written contiguously to a PLFS backend. Multiple
files can be written concurrently when a PLFS mount point has multiple
backends. This provides parallelism that is transparent to the user. There
are no indexes, containers, metalinks, etc. Thus, the PLFS overhead of
turning the N-1 pattern into N-N is eliminated. A mount point cannot support
both "Flat File" and "Container" mode. The same backend directory
specification can't be used in both "Flat File" and "Container" mode. A site
must handle this administratively so that the mount points are properly
declared to meet these criteria.
* Truncating an Open File Doesn't Work
Ticket #66: The metadata for a file that was truncated wasn't being
updated. So, the file had the new contents, but the old length. That made a
user of the file get the wrong size for the file and use it incorrectly. All
metadata is properly updated now so that the size of the file is correct
after it is truncated.
* PLFS Build within a PLFS Mount Point
Ticket #77: There was an error in the "./configure" script run that
indicated a problem building PLFS within a PLFS mount point managed by PLFS
2.2rc2. A metalink was not resolving to the correct physical location. This
bug was fixed and PLFS can now be correctly configured and built within a
PLFS mount point.
* Overwriting N-1 File with Less Data Shows Wrong Size
Ticket #78: When you overwrite a N-1 PLFS file with less data than it
originally had, it does not show the correct size. However, if you write it
with more data, it does show the correct size. There was an inconsistency in
how the metadata information was maintained. It was fixed and PLFS now
reports the correct file size in all cases.
~~~ Version 2.1
* Added include directive in plfsrc files. This way rc files can be modular
and combined. (Trac ticket #26)
* Fixed corruption of backends when removing a non-empty directory; symptom
was 'No such file or directory' from ls output:
ct-fe1 : pwd /plfs/scratch3/afn
ct-fe1 : ls -als
ls: cannot access blah: No such file or directory
total 32
16 drwxr-xr-x 2 afn eapdev 4096 Sep 10 01:04 .
16 drwxr-xr-x 11 root root 4096 Sep 6 18:08 ..
? ?????????? ? ? ? ? ? blah
(Trac ticket #30)
* Fixed rename through the FUSE interface. Adding multiple backends in 2.0
broke this functionality. (Trac ticket #17)
* added the sweet-ass FileOp.* classes
* updated chown/chmod to work w/ shadow containers
* fixed race condition bug in shadow subdirs being linked into canonical
* made tools like plfs_recover a bit more user friendly. the plfs lib
used to only work if you passed full logical paths which meant if your cwd
was in a plfs mount point, it wouldn't work with relative paths. Now it
does (but it does print a warning message in this case which might be spammed
a bit (ticket 43522)
* fixed bug with opening non-existent files for read through ADIO. also but
where you could turn a file into a directory by writing to it. (ticket 41436)
* fixed off-by-one error where subdirs were created btwn 1 and
plfsrc:num_hostdirs but adio assumed they were created btwn 0 and
plfsrc:num_hostdirs - 1. Now both count from 0 like computers should.
(ticket 42896)
* changed hash by path to hash by filename. No compatible with 2.0.1. But
plfs_recover can be used to fix this. This was necessary bec hash by path
would orphan children when parent directories were renamed.
* Now 'plfs:' prefix can be passed to all plfs functions.
It is used to direct MPIIO to use plfs and MPIIO in that case strips it
off. But users might get used to using it so if they ever do use it, then
it will work. Brett or someone commented that this might help prevent
confusion so give it a try. I hope it doesn't add a lot of overhead. It's
just one extra strncmp in the normal case where prefix isn't present
(ticket 41492)
* put better info (job size, BW) into the global_summary_dir droppings
* added tools/plfs_query which is just a tool for developers.
* missing plfsrc was assert(0). Now it's a bit more elegant (ticket 40866)
* added tools/plfs_version which queries all plfs mountpoints and reports
their versions. It also digs into a container and reports the version that
created a particular plfs file if a target is passed on the command line
* Fixed the bug where multiple slashes (e.g. /mnt///plfs) were not working
in the ADIO path. They were fine in the FUSE path because FUSE cleans up
paths for us but in the ADIO path, they were failing because we were just
using strcmp to test the path against the possible mount points. Now we
tokenize and compare each token one by one. Extra dots in a path
(e.g. /mnt/./plfs) still won't work in ADIO.
* Added README.developer
* added tools/plfs_recover which is designed for the case where a user
modified the plfsrc to change the set of backends for an existing mount point.
They shouldn't ever do this! But if they do, what will happen is that they
will see their file on a readdir() but on a stat() they'll either get ENOENT
because there is nothing at the new canonical location, or they'll see a
shadow container which looks like a directory to them. So this function makes
it so that a plfs file that had a different previous canonical location is now
recovered to the new canonical location. It also works on directories. It
also works across file systems (i.e. it doesn't use rename).
~~~ Version 2.0.1
* Reduced container overhead by 4 files and 2 directories: combined access
and creator, removed version dir and combined all 3 version files into one,
combined openhosts and meta dir
* Fix bug in bitmap in ad_plfs_open where some compilers don't correctly
do bit's. Now we do bitmaps ourselves instead of relying on the compiler.
~~~ Version 2.0
* Got the distributed metadata hashing working. Removed map from plfsrc
~~~ Version 1.1.9
* got the addMeta to drop a better summary file in global_summary_dir
* added parallel index read for read open() through ADIO. This avoids
the horrible N^2 index read problem that we encounter through FUSE.
We might eventually need to figure out how to address this in FUSE, but
for the time being, our current usage model only does parallel IO through
ADIO so this is sufficient. Rank 0 reads the subdirs, and farms them
out to subdir leaders. Each of those reads the subdirs, and farms out
the individual index files to their subdir workers. Then all the
individual index files are read in parallel, compressed up through
subdir leaders, compressed again to rank 0 who then broadcasts it out
to all. This removes ALL redundant activity at the file system by
transferring a bit more work to the much faster interconnect network.
* improved the documentation a bit about building plfs-patched MPI's
* automated the patch generation for one of our two patches
* removed a few deprecated calls deep within the Index class (startBuffering)
~~~ Version 1.1.8
* Fixed the getattr bug (we were just getting the attr from the openfile
handle and not getting it from the Container as well). This made the size
reasonable but the permissions/ownership/everthing else were garbage).
This also required a bit of smarts. This was really slowing things down
in lanl's fs_test program which does a file stat before the close and this
was making rank 0 descend the entire container structure and build an index
in order to figure out the size. Now we added a lazy flag and ad_plfs_fcntl
passes the lazy flag and it gets the size from the open file handle (this
won't be perfectly accurate but it will show incremental progress and won't
be expensive to query).
* fixed ADIO parse conf error. Now a good error is reported when the
conf file is bad or the file path is bad.
* added global_summary_dir to the plfsrc
* fixed tools/plfs_flatten_index which broke due to the stop_buffering
thing we put into the Index to stop buffering when memory is exhausted.
the buffering is so everyone retains their write index in memory to do
the flatten on close in ADIO
~~~ Version 1.1.7
Fixed a bug in rename.
~~~ Version 1.1.6
Added support for a statfs override in the plfsrc file
in response to ticket 35609.
~~~ Version 1.1.5
Bug fix for symbolic links. I swear this is the second time I fixed this bug
~~~ Version 1.1.4
Bug fix in the multiple mount point parsing (unitialized string pointer)
~~~ Version 1.1.3
Added the multiple mount point parsing in plfsrc
~~~ Version 1.1.2
Index flattening in ADIO close for write
~~~ Version 1.1.1
Index broadcast in ADIO open for read
~~~ Version 1.0.0.0pre.0.C
No new features or performance engineering. Just bug fixes.
All the bug fixes we've made since the last release:
31207 unlink non-atomic right now resolved plfs adamm 0
johnbent@lanl.gov 2 weeks ago 8 days ago 0
- We now rename the container to a hidden path before unlinking since
the recursive unlink over a container is not atomic.
31236 reproducible rename bug resolved plfs Nobody 0
johnbent@lanl.gov 2 weeks ago 2 weeks ago 0
- Simple bug in rename. PLFS has to remove a container in the destination
spot before doing a rename since the rename system call can clobber an
existing file but not an existing directory. So if the user wants to rename
over an existing PLFS file, we have to remove the container first. We had
a bug where we were removing the src instead of the dest.
31253 plfsrc improvement resolved plfs Nobody 0
adamm@lanl.gov 2 weeks ago 8 days ago 0
- Make it so that if there is no map defined in the plfsrc file, it
defaults to:
map MNT:BACK
31346 Open files on rename resolved plfs johnbent 0
adamm@lanl.gov 2 weeks ago 2 weeks ago 8 days ago 0
- The problem was that the rename was creating an open file in the open_files
cache with the relative name but the open_files cache should be for absolute
names. Then when the remove file was asking the open_files cache to remove it,
it wasn't found (i.e. absolute != relative). So now rename correctly inserts
the absolute path into the open_files cache.
31348 symlink not working resolved plfs Nobody 0
johnbent@lanl.gov 2 weeks ago 2 weeks ago 0
- We now handle sym links correctly. We just stash the logical front-end path
into the backend symlink. Then it just works. On the plfs_getattr though, we
now have to do two stats: one of the entry to see if it's a symlink, ENONENT,
a file, or a directory. If it's a directory, we then have to check for the
access file.
31367 weird bug with relative paths in f_symlink (FUSE problem?) resolved plfs Nobody 0
johnbent@lanl.gov 2 weeks ago 9 days ago 0
- This was nothing. I had forgotten that symlinks can actually be relative.
31375 can't remove a read-only file resolved plfs Nobody 0
johnbent@lanl.gov 13 days ago 9 days ago 0
- fixed this by making dirMode() always make files owner-writable because
a file can only be deleted in the parent dir is writable. this means we
can only remove a container if the top-level dir is writable. this
means that a user can't protect file from themselves writing it. we can
live with that.
31421 svn co times out resolved plfs Nobody 0
adamm@lanl.gov 9 days ago 40 hours ago 0
- This was a problem with O_RDWR files. We started creating the index on opens
for reads to check for permissions problems. However, for O_RDONLY, we can
cache this index, but not for O_RDWR. We were however incorrectly caching it
for O_RDWR and this was causing problems when reads wouldn't get newly written
data.
31424 weird issue with using an executable made in a plfs mountpoint on centos resolved plfs samuel 0
adamm@lanl.gov 9 days ago 9 days ago 8 days ago 0
- Something weird where we built on centos:
We build plfs itself in a plfs mount point. We then try to run that
plfs executable and get this error:
cannot restore segment prot after reloc: Permission denied
we then do this:
/usr/sbin/setenforce 0
And it works. I think this is not a big deal, we know a workaround, and
it might actually work better on CentOS since we fixed some other bugs that
were causing build issues on other distributions. So this might be fixed now
and if not, there's an easy workaround.
31427 can't turn off debugging info resolved plfs johnbent 0
adamm@lanl.gov 9 days ago 8 days ago 0
- Silly issues with setting DEFINES in the configuration code.
31587 umask breaking our 777 dropping thing resolved plfs Nobody 0
johnbent@lanl.gov 3 days ago 3 days ago 0
- We need to set droppings to 777 so that they can also be modified. Otherwise,
some other user can write to a file and create droppings and then the original
owner can't remove the file. But the umask was sometimes getting applied as
well and the droppings weren't always 777. So now we set the umask to 0 in
WriteFile::openFile (and then restore it).
31616 truncate bug resolved plfs Nobody 0
johnbent@lanl.gov 3 days ago 40 hours ago 0
- We haven't seen this in a long time. We think it got magically fixed by some
other bug fixes.
31639 rmdir error seen on cuda resolved plfs Nobody 0
johnbent@lanl.gov 2 days ago 47 hours ago 0
- This was strange and annoying since it was old code that had never had
problems before on other systems. But then something about FUSE on cuda was
different.The problem was that on opendir, we were doing the actual readdir and
caching it. Then on readdir, we'd return the cached dirents. But what was
happening was:
opendir
readdir from offset 0 to end
unlink entry
readdir from offset 0 to end
stat entry
ENOENT
So now we remove the dirents after we do a complete readdir to end. Then if a
new readdir comes in, we refetch the dirents. Therefore, we get the newly
revised set of dirents following the unlink so we don't return a deleted entry
on the readdir.
31656 permission error on open resolved plfs Nobody 0
johnbent@lanl.gov 2 days ago 22 hours ago 0
- This was a long weird one but the resolution was that Gary had code that
was calling open with O_CREAT but no mode and it was resulting in very weird
permissions. This is fine and that's what happens on other file systems is that
the file gets created with weird permission bits. But PLFS was failing to
create the file bec it was putting the weird permission bits on the container
and was then unable to create the droppings. We just weren't adding S_IRUSR in
the dirMode() call.
31704 rename of an open file causes segfault on next read of .plfsdebug resolved plfs Nobody 0
johnbent@lanl.gov 42 hours ago 40 hours ago 0
- This was related to the other rename of an open file. The rename was
removing the plfs fd but not removing the entry in the open_files cache. Then
when we iterated down it, we dereferenced a dangling pointer. The fix was
making sure that we keep the set of open files consistent with the open_files
cache. We screwed this up by inserting a relative path and then trying to
remove an absolute one.
31720 can't build plfs in plfs johnbent@lanl.gov 40 hours ago 14 hours ago 0
- This is fixed now. The problem was that truncate wasn't cleaning up meta
droppings so the stat was wrong and reads that pre-allocated a buffer
according to the stat were finding garbage in the buffer. It works now for
truncates to 0 bec it just unlinks all the metadata droppings. However, I
wonder if on this case, we should probably set the mtime of the access file.
It doesn't yet work for truncates to the middle of the file. For these, we have
to selectively unlink/modify metadata droppings.
~~~ Version 1.0pre.0
1) Problems with building the PLFS SC09 paper. Problems are in data/summary
directory. When make does ./get_data > data, it was failing due to a bad
reference count because we were incrementing the writers when a child came
in and used the parent's fd. Then when the file was closed, it asserted since
the reference count was still 1. So now we don't increment the writers for
children but now we're still seeing this other strange behavior:
jvpn-132-65:/tmp/plfs_mnt/SC09/data/summary>./get_data >! data
cat: stdout: Result too large
2) plfs_read is now multi-threaded when a logical read spans multiple physical
data chunks. In addition, if these chunks need to be opened that is done in
the threads as well so we parallelize both the opens and the reads
3) Container::populateIndex is now multi-threaded when the container has
more than one index file. This means of course that overlapped writes are
now non-deterministically handled in addition to being undefined as they
were previously.
4) Moved the mapping from a logical file to a physical path into the plfs
library and out of fuse. The mapping consults a plfsrc file in order to
know how to distribute users across multiple backends.
~~~ Version 0.1.6
1) Gary found a bug in the old version running on cuda.lanl.gov. I tried to
reproduce it but found it was no longer in 0.1.5. The test was just doing
a bunch of dd's with increasing offsets and then reading from another node.
Just sure what was causing that bug. Didn't go back and debug old versions.
But while investigating that bug, I found another bug that was caused when
dd would truncate a file and the index file trunctated would be truncated
at the first entry and then would be empty. But I was assuming that it
wasn't empty so I was checking the entry before the truncate point to see
if it needed to be partially truncated. But when there was no entry, then
this was tromping memory.
~~~ Version 0.1.5
1) Fixed a bug where temp dirs used to attempt to create the container were
not being deleted when some other node created the container first. The
problem was that we changed from SUID to accessfile to identify a container
and we weren't removing the accessfile in the temporary container before
trying to unlink the temporary container.
2) Added a container/version/$(TAG_VERSION) dropping so in the future we can
make sure to be backward compatible.
~~~ Version 0.1.5
1) Changed the behavior of truncate to not remove droppings but just to truncate
them. This was needed because we discovered that if one proc already has open
handles to droppings, another can do a truncate, and then those droppings
disappear when the handles are closed. This was discovered using the qio
tests which don't do barriers after open and closes.
2) Fixed reference counting bugs in the Plfs_fd structure. Again this was
discovered due to the lack of barriers in the qio tests. When we had
simultaneous readers and writers in a Plfs_fd structure, some of the closes
were screwing up the reference counting, and other thread's Plfs_fd structures
were being removed out from under them.
3) cvs co was causing problems because it was renaming open files. I thought
I had fixed that before but it cropped up again. Now in the rename, when we
are renaming an open file, we remove the old open file, we change it's
internal names, and we add it back again as a new open file. There's a bunch
of comments in plfs_fuse.cpp:f_rename about this as well.
4) Fixed a bug that was caused by layering PLFS's (e.g. running PLFS-MPI onto
a PLFS-MNT) which was causing this problem:
a) the app creates file 'foo'
b) PLFS-MPI does mkdir 'foo' on PLFS-MNT
c) PLFS-MNT does mkdir 'foo' on PLFS-BACK
d) PLFS-MPI touches 'foo/access' on PLFS-MNT
e) PLFS-MNT does mkdir 'foo/access' on PLFS-BACK
f) PLFS-MPI tries to access directory 'foo'
g) PLFS-MNT thinks 'foo' is a container since it has an access entry
h) PLFS-MPI gets confused bec what was a directory is now a file
Also, this revealed a bug where PLFS library thought that mmap returned NULL
on error, but really it returns MAP_FAIL, so PLFS library didn't notice when
mmap failed and this caused a segfault. To fix this, the check for the access
file checks both existence and whether it is a file. Now when the PLFS-MPI
does a touch of 'foo/access,' the PLFS-MNT makes a 'foo/access' directory, so
when PLFS-MPI looks at 'foo,' PLFS-MNT doesn't find a 'foo/access' file, so it
treats 'foo' as a directory which is the proper behavior in this confusing
layered case. This also revealed the worse problem that a user can never
create an 'access' entry in a directory because PLFS will start treating that
directory as a container and when it doesn't have the other entries that PLFS
expects in a container, it gets all weird. So to fix this, we have broken
backwards compatibility by changing the name of the access file from 'access'
to '.plfsaccess081173' At some point, we need to add backward compatibility
checking.
5) Fixed it so that the plfs_map in the trace/ directory is correctly built
again. I tried to change it so that it builds from the library only but it
access the low-level index calls so that doesn't work. So now it builds by
just using all the src/*.o files instead of just trying to link with the
library alone.