This repository has been archived by the owner on Jan 31, 2020. It is now read-only.

Rework how partition files are discovered and PartIO is implemented
This actually breaks the RepoIO implementation somewhat, which also needs reworking.
dhardy committed May 5, 2016
1 parent 3fc332e commit ea988f0
Showing 10 changed files with 497 additions and 476 deletions.
46 changes: 26 additions & 20 deletions doc/repo-files.md
@@ -15,56 +15,55 @@ objects which only need provide access to "file-like" data streams somewhere.
We consider all paths relative to the repository's root directory.


Potential changes
-----------------------

The number section could change, e.g. to `p2-s5-l1` instead of `pn2-ss5-cl1`,
or even lose the `-` separators.

The partition number part (`pnN`) is present as a compromise, letting the
software more easily discover partitions and their base-names (`BASENAME` part)
without requiring that the user-provided part of the software be able to guess
or remember (via the header in every snapshot) relationships between numbers
and base-names. It is not strictly necessary and could be removed.


Partitions
Partition file names
---------------

A 'partition' does not need to be part of a repository, but can also be used
on its own (excuse the misleading name). Either way, its state and history are
recorded via one or more snapshot files and any number of log files, each
associated with one snapshot file.

Snapshot file names should take the form:
All partition files should be in the same directory.

BASENAMEssS.pip
Snapshot file names should take one of the forms:

ssS.pip
BASENAME-ssS.pip

and commit log file names:

BASENAMEssS-clL.piplog
ssS-clL.piplog
BASENAME-ssS-clL.piplog

where `BASENAME` can be any substring of a path (including `/` path separators)
or nothing at all, and `S` and `L` are numbers (both non-negative integers
without leading zeros).

For example, `BASENAME` might be `addressbook-` leading to file names like
For example, `BASENAME` might be `addressbook` leading to file names like

addressbook-ss1.pip
addressbook-ss1-cl1.piplog
addressbook-ss1-cl2.piplog
addressbook-ss2.pip
addressbook-ss2-cl1.piplog

`BASENAME` may end with `pnN` as with repositories (below), e.g. `example-pn5`.

Sometimes a partition's files are found via a *prefix* which is a path relative
to the repository's root directory followed by `BASENAME` and `-`; for example
if the above addressbook files are in a subdirectory `a`, the prefix would be
`a/addressbook-`.
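
The naming scheme above can be sketched as a small classifier. This is only an illustration under stated assumptions, not Pippin's actual discovery code: the names `PartFile`, `parse_num`, `split_tagged` and `classify` are hypothetical, the input is assumed to be a bare file name without directory components, and the code follows the `ssS` / `BASENAME-ssS` forms (with the `-` separator) rather than the older `BASENAMEssS` form.

```rust
/// Hypothetical classification of a partition file name (not part of Pippin).
#[derive(Debug, PartialEq)]
enum PartFile {
    /// `ssS.pip` or `BASENAME-ssS.pip`
    Snapshot { basename: String, ss: u32 },
    /// `ssS-clL.piplog` or `BASENAME-ssS-clL.piplog`
    Log { basename: String, ss: u32, cl: u32 },
}

/// Parse a non-negative integer written without leading zeros.
fn parse_num(s: &str) -> Option<u32> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    if s.len() > 1 && s.starts_with('0') {
        return None;
    }
    s.parse().ok()
}

/// Split off the last `-`-separated component, which must be `TAG<number>`.
/// Returns what precedes it (without the separator) and the number.
fn split_tagged<'a>(s: &'a str, tag: &str) -> Option<(&'a str, u32)> {
    let (head, last) = match s.rfind('-') {
        Some(i) => (&s[..i], &s[i + 1..]),
        None => ("", s),
    };
    let num = parse_num(last.strip_prefix(tag)?)?;
    Some((head, num))
}

/// Classify a file name, returning `None` if it does not fit the scheme.
fn classify(file_name: &str) -> Option<PartFile> {
    if let Some(stem) = file_name.strip_suffix(".piplog") {
        let (rest, cl) = split_tagged(stem, "cl")?;
        let (basename, ss) = split_tagged(rest, "ss")?;
        Some(PartFile::Log { basename: basename.to_string(), ss, cl })
    } else if let Some(stem) = file_name.strip_suffix(".pip") {
        let (basename, ss) = split_tagged(stem, "ss")?;
        Some(PartFile::Snapshot { basename: basename.to_string(), ss })
    } else {
        None
    }
}
```

With this sketch, `classify("addressbook-ss1-cl2.piplog")` yields a `Log` with base-name `addressbook`, while a name such as `addressbook-ss01.pip` is rejected because of the leading zero.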


Repositories
-----------------

A repository is a set of (at least one) partition(s). It has no "master file"
or any other storage outside of the partition files.

File names are as set out above for partitions except that `BASENAME` must end
`pnN` where `N` is a partition number.
File names are as set out above for partitions though usually `BASENAME` ends
with `pnN` where `N` is a partition number (this avoids the need to read a
partition's file to find the partition number).

For example, a repository could have the following files:

@@ -81,3 +80,10 @@ For example, a repository could have the following files:
archives/2016-pn22-ss1-cl1.piplog
archives/2016-pn22-ss1-cl2.piplog
archives/2016-pn22-ss1-cl3.piplog

Repository file discovery should start from a top directory and optionally work
recursively. If not recursive, only `*.pip` and `*.piplog` files in the top
directory will be discovered; if recursive, sub-directories will also be
checked. Partition files may be in any sub-directory; *however*, for each
partition, all files must be in the same directory. If this is not the case,
discovery may fail, or it may continue while warning that some files may be
missed.
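
The same-directory rule can be checked with a sketch like the one below. It is a rough illustration, not the code in `pippin::discover`: the helpers `drop_last`, `basename_of` and `split_partitions` are hypothetical, and the sketch assumes that a partition is identified by its base-name alone (so two distinct partitions sharing a base-name in different directories would be reported too).

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical helper: drop a trailing `TAG<digits>` component and its `-`.
fn drop_last<'a>(s: &'a str, tag: &str) -> Option<&'a str> {
    let (head, last) = match s.rfind('-') {
        Some(i) => (&s[..i], &s[i + 1..]),
        None => ("", s),
    };
    let digits = last.strip_prefix(tag)?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    Some(head)
}

/// Hypothetical helper: the `BASENAME` part of a `.pip` / `.piplog` file name.
fn basename_of(file_name: &str) -> Option<&str> {
    if let Some(stem) = file_name.strip_suffix(".piplog") {
        drop_last(drop_last(stem, "cl")?, "ss")
    } else {
        drop_last(file_name.strip_suffix(".pip")?, "ss")
    }
}

/// Return (sorted) the base-names whose files were found in more than one
/// directory. Paths that do not look like partition files are ignored.
fn split_partitions(paths: &[&str]) -> Vec<String> {
    let mut dirs: HashMap<String, Vec<PathBuf>> = HashMap::new();
    for p in paths {
        let path = Path::new(p);
        let name = match path.file_name().and_then(|n| n.to_str()) {
            Some(n) => n,
            None => continue,
        };
        let base = match basename_of(name) {
            Some(b) => b,
            None => continue,
        };
        let dir = path.parent().unwrap_or(Path::new("")).to_path_buf();
        let seen = dirs.entry(base.to_string()).or_default();
        if !seen.contains(&dir) {
            seen.push(dir);
        }
    }
    let mut bad: Vec<String> = dirs
        .into_iter()
        .filter(|(_, d)| d.len() > 1)
        .map(|(b, _)| b)
        .collect();
    bad.sort();
    bad
}
```

A caller could treat a non-empty result either as an error or as a warning that some files may be missed, matching the two outcomes described above.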
31 changes: 18 additions & 13 deletions examples/pippincmd.rs
@@ -3,7 +3,6 @@
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

//! Command-line UI for Pippin
#![feature(box_syntax)]
#![feature(str_char)]

extern crate pippin;
@@ -21,8 +20,8 @@ use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::rc::Rc;
use docopt::Docopt;
use pippin::{Partition, PartIO, ElementT, State, MutState, UserData};
use pippin::discover::*;
use pippin::{Partition, PartIO, ElementT, State, MutState, UserData, PartId};
use pippin::{discover, fileio};
use pippin::error::{Result, PathError, ErrorTrait};
use pippin::util::rtrim;

@@ -187,8 +186,10 @@ fn inner(files: Vec<String>, op: Operation, args: Rest) -> Result<()>
name[0..len].to_string()
});

let io = try!(DiscoverPartFiles::from_dir_basename(path, &name, None));
try!(Partition::<DataElt>::create(box io, &repo_name,
let prefix = path.join(name);
let part_id = PartId::from_num(1); //NOTE: could be configurable
let io = fileio::PartFileIO::new_empty(part_id, prefix);
try!(Partition::<DataElt>::create(Box::new(io), &repo_name,
vec![UserData::Text("by pippincmd".to_string())].into()));
Ok(())
},
@@ -200,8 +201,12 @@ fn inner(files: Vec<String>, op: Operation, args: Rest) -> Result<()>
panic!("No support for -c / --commit option");
}
println!("Scanning files ...");
//TODO: verify all files belong to the same partition `args.part`
let discover = try!(DiscoverPartFiles::from_paths(paths, None));
assert!(paths.len() > 0);
if paths.len() > 1 {
//TODO: change CLI maybe?
println!("Warning: using first path to find partition, ignoring others");
}
let discover = try!(discover::part_from_path(&paths[0], None));

if let PartitionOp::List(list_snapshots, list_logs, list_commits) = part_op {
println!("ss_len: {}", discover.ss_len());
@@ -218,7 +223,7 @@ fn inner(files: Vec<String>, op: Operation, args: Rest) -> Result<()>
}
}
if list_commits {
let mut part = try!(Partition::<DataElt>::open(box discover));
let mut part = try!(Partition::<DataElt>::open(Box::new(discover)));
try!(part.load(true));
//TODO: ideally commits should be sorted before printing.
// I'm not sure whether the sorting should happen here or
@@ -231,7 +236,7 @@ fn inner(files: Vec<String>, op: Operation, args: Rest) -> Result<()>
}
Ok(())
} else {
let mut part = try!(Partition::<DataElt>::open(box discover));
let mut part = try!(Partition::<DataElt>::open(Box::new(discover)));
{
let (is_tip, mut state) = if let Some(ss) = args.commit {
try!(part.load(true));
@@ -324,9 +329,9 @@ fn inner(files: Vec<String>, op: Operation, args: Rest) -> Result<()>
assert_eq!(args.commit, None);
println!("Scanning files ...");
// #0017: this should print warnings generated in discover::*
let discover = try!(DiscoverRepoFiles::from_paths(paths));
let discover = try!(discover::RepoFileIO::from_paths(paths));
for part in discover.partitions() {
println!("Partition {}: {}/{}*", part.part_id(), part.dir().display(), part.basename());
println!("Partition {}: {}*", part.part_id(), part.path_prefix().display());
}
Ok(())
},
@@ -385,10 +390,10 @@ pub struct CmdFailed {
impl CmdFailed {
/// Create an "external command" error.
pub fn err<S, T: fmt::Display>(cmd: T, status: Option<i32>) -> Result<S> {
Err(box CmdFailed{ msg: match status {
Err(Box::new(CmdFailed{ msg: match status {
Some(code) => format!("external command failed with status {}: {}", code, cmd),
None => format!("external command failed (interrupted): {}", cmd),
}})
}}))
}
}
impl fmt::Display for CmdFailed {
53 changes: 21 additions & 32 deletions examples/sequences.rs
@@ -12,9 +12,8 @@ extern crate log;
extern crate env_logger;

use std::io::Write;
use std::path::{Path, PathBuf};
use std::path::{Path};
use std::process::exit;
use std::fs;
use std::cmp::min;
use std::u32;
use std::collections::hash_map::{HashMap, Entry};
@@ -25,7 +24,7 @@ use rand::Rng;
use rand::distributions::{IndependentSample, Range, Normal, LogNormal};

use pippin::{ElementT, PartId, Partition, State, MutState, PartIO, PartState};
use pippin::discover::*;
use pippin::{discover, fileio};
use pippin::repo::*;
use pippin::merge::*;
use pippin::error::{Result, OtherError};
@@ -333,16 +332,16 @@ impl Generator for Power {
// ————— main —————
const USAGE: &'static str = "
Generates or reads a database of sequences.
In repository mode, PATH should be a directory. In single-partition mode, PATH
may be either a directory or a data file.
Usage:
sequences [options]
sequences [options] PATH
Options:
-h --help Show this message.
-d --directory DIR Specify the directory to read/write the repository
-p --partition BASE Instead of a whole repository, just use the partition
with basename BASE (e.g. the basename of
'xyz-pn1-ss0.pip' is 'xyz-pn1').
-p --partition PN Create or read only a partition number PN. Use PN=0 for
auto-detection but to still use single-partition mode.
-c --create Create a new repository
-s --snapshot Force creation of snapshot at end
-g --generate NUM Generate NUM new sequences and add to the repo.
@@ -356,8 +355,8 @@ sure the repository name and partition number are correct.
#[derive(Debug, RustcDecodable)]
#[allow(non_snake_case)]
struct Args {
flag_directory: Option<String>,
flag_partition: Option<String>,
arg_PATH: String,
flag_partition: Option<u64>,
flag_generate: Option<usize>,
flag_create: bool,
flag_snapshot: bool,
@@ -377,34 +376,25 @@ fn main() {
.and_then(|d| d.decode())
.unwrap_or_else(|e| e.exit());

let dir = PathBuf::from(match args.flag_directory {
Some(dir) => dir,
None => {
println!("Error: --directory option required (use --help for usage)");
exit(1);
},
});
if fs::create_dir_all(&dir).is_err() {
println!("Unable to create/find directory {}", dir.display());
exit(1);
}

let mode = match args.flag_generate {
Some(num) => Mode::Generate(num),
None => Mode::None,
};

let repetitions = args.flag_repeat.unwrap_or(1);

let result = run(&dir, args.flag_partition, mode, args.flag_create,
let result = run(Path::new(&args.arg_PATH), args.flag_partition,
mode, args.flag_create,
args.flag_snapshot, repetitions);
if let Err(e) = result {
println!("Error: {}", e);
exit(1);
}
}

fn run(dir: &Path, part_basename: Option<String>, mode: Mode, create: bool,
// part_num: None for repo mode, Some(PN) for partition mode, where PN may be
// 0 (auto mode) or a partition number
fn run(path: &Path, part_num: Option<u64>, mode: Mode, create: bool,
snapshot: bool, repetitions: usize) -> Result<()>
{
let solver1 = AncestorSolver2W::new();
@@ -448,16 +438,15 @@ fn run(dir: &Path, part_basename: Option<String>, mode: Mode, create: bool,
Mode::None => {},
};

if let Some(basename) = part_basename {
let mut io = Box::new(try!(DiscoverPartFiles::from_dir_basename(dir, &basename, None)));
if io.part_id() == None {
// On creation or where discovery fails we need a number:
io.set_part_id(PartId::from_num(1));
}

if let Some(pn) = part_num {
let mut part = if create {
// On creation we need a number; 0 here means "default":
let part_id = PartId::from_num(if pn == 0 { 1 } else { pn });
let io = Box::new(fileio::PartFileIO::new_empty(part_id, path.join("seqdb")));
try!(Partition::<Sequence>::create(io, "sequences db", vec![].into()))
} else {
let part_id = if pn != 0 { Some(PartId::from_num(pn)) } else { None };
let io = Box::new(try!(discover::part_from_path(path, part_id)));
let mut part = try!(Partition::<Sequence>::open(io));
try!(part.load(false));
part
@@ -483,7 +472,7 @@ fn run(dir: &Path, part_basename: Option<String>, mode: Mode, create: bool,
try!(part.write_snapshot(vec![].into()));
}
} else {
let discover = try!(DiscoverRepoFiles::from_dir(dir));
let discover = try!(discover::RepoFileIO::from_dir(path));
let rt = SeqRepo::new(discover);

let mut repo = if create {
25 changes: 9 additions & 16 deletions src/detail/part.rs
@@ -21,7 +21,7 @@ use detail::states::{PartStateSumComparator};
use detail::{Commit, ExtraMeta, CommitQueue, LogReplay};
use merge::{TwoWayMerge, TwoWaySolver};
use {ElementT, Sum, PartId};
use error::{Result, ArgError, TipError, PatchOp, MatchError, OtherError, make_io_err};
use error::{Result, TipError, PatchOp, MatchError, OtherError, make_io_err};

/// An interface providing read and/or write access to a suitable location.
///
@@ -32,13 +32,8 @@ pub trait PartIO {
/// Convert self to a `&Any`
fn as_any(&self) -> &Any;

/// Return the partition identifier (number), if known.
///
/// Note that this is currently required, though there is still the
/// possibility of adapting `Partition` to discover the number when loading
/// a file (the issue being that it must initially exist without knowing
/// its number).
fn part_id(&self) -> Option<PartId>;
/// Return the partition identifier.
fn part_id(&self) -> PartId;

/// Return one greater than the snapshot number of the latest snapshot file
/// or log file found.
@@ -137,7 +132,7 @@ impl DummyPartIO {

impl PartIO for DummyPartIO {
fn as_any(&self) -> &Any { self }
fn part_id(&self) -> Option<PartId> { Some(self.part_id) }
fn part_id(&self) -> PartId { self.part_id }
fn ss_len(&self) -> usize { 0 }
fn ss_cl_len(&self, _ss_num: usize) -> usize { 0 }
fn read_ss(&self, _ss_num: usize) -> Result<Option<Box<Read+'static>>> {
@@ -239,8 +234,7 @@ impl<E: ElementT> Partition<E> {
{
try!(validate_repo_name(name));
let ss = 0;
let part_id = try!(io.part_id().ok_or(
ArgError::new("PartIO's `part_id()` must not return None")));
let part_id = io.part_id();
info!("Creating partition {}; writing snapshot {}", part_id, ss);

let state = PartState::new(part_id);
@@ -288,15 +282,14 @@ impl<E: ElementT> Partition<E> {
/// ```no_run
/// use std::path::Path;
/// use pippin::Partition;
/// use pippin::discover::DiscoverPartFiles;
/// use pippin::discover;
///
/// let path = Path::new(".");
/// let io = DiscoverPartFiles::from_dir_basename(path, "my-partition", None).unwrap();
/// let path = Path::new("./my-partition");
/// let io = discover::part_from_path(path, None).unwrap();
/// let partition = Partition::<String>::open(Box::new(io));
/// ```
pub fn open(io: Box<PartIO>) -> Result<Partition<E>> {
let part_id = try!(io.part_id().ok_or(
ArgError::new("PartIO's `part_id()` must not return None")));
let part_id = io.part_id();
trace!("Opening partition {}", part_id);
Ok(Partition {
io: io,
3 changes: 0 additions & 3 deletions src/detail/repo_traits.rs
@@ -34,9 +34,6 @@ pub trait RepoIO {
/// fails if it is already taken. `prefix` is a relative path plus file-name
/// prefix, e.g. `data/misc-` would result in a snapshot having a name like
/// `misc-pn1-ss1.pip` inside the `data` subdirectory.
///
/// On success, returns the index of the new partition (for use with
/// `make_partition_io()`).
fn add_partition(&mut self, num: PartId, prefix: &str) -> Result<()>;

/// Construct and return a new PartIO for partition `num`.