
Multiply sparse matrices #132

Merged (29 commits, Aug 23, 2016)
Changes from 22 commits

Commits
46a0501  started a branch for multiplying sparse matrices against vectors.  (rhl-, Jan 27, 2016)
ecab33b  added values removed MultiplyCSR.  (rhl-, Feb 3, 2016)
114ea14  implemented multiply for graphs and distgraphs.  (rhl-, Feb 4, 2016)
e08607b  implemented multiply for graphs and distgraphs.  (rhl-, Jan 27, 2016)
babba77  fixed this.  (rhl-, Feb 7, 2016)
8e04d35  compiles finally.  (rhl-, Feb 8, 2016)
70c8108  Merge branch 'master' into multiply_sparse_matrices  (rhl-, Feb 19, 2016)
83812d9  Merge branch 'multiply_sparse_matrices' of github.com:rhl-/Elemental …  (rhl-, Feb 19, 2016)
c78c650  Merge branch 'master' into multiply_sparse_matrices  (rhl-, Aug 1, 2016)
2dbc810  I believe things will now compile.  (rhl-, Aug 1, 2016)
17ef6b6  Merge branch 'master' of git://github.com/elemental/Elemental into mu…  (rhl-, Aug 1, 2016)
8b85694  towards multiplying sparse matrices  (rhl-, Aug 3, 2016)
b18c7e5  towards multiplying sparse matrices  (rhl-, Aug 3, 2016)
418f3b6  Merge branch 'master' of git://github.com/elemental/Elemental into mu…  (rhl-, Aug 20, 2016)
dbb84c8  temporarily disabling one new function as I cant figure out macro sys…  (rhl-, Aug 20, 2016)
e07bcdb  Merge branch 'master' of git://github.com/elemental/Elemental into mu…  (rhl-, Aug 20, 2016)
8cd5636  merged in jacks changes. need to figure out whats wrong with template…  (rhl-, Aug 20, 2016)
9842229  hm still broken.  (rhl-, Aug 20, 2016)
a43cb33  put back multivec.  (rhl-, Aug 20, 2016)
edfda9e  trying to get unit test compiling.  (rhl-, Aug 20, 2016)
adf8f2c  added a basic unit test.  (rhl-, Aug 20, 2016)
32571ce  fixed introduced bug Y not x.  (rhl-, Aug 20, 2016)
c762c2a  addresses review.  (rhl-, Aug 21, 2016)
04e78df  fixed errors.  (rhl-, Aug 21, 2016)
2f6048e  address review.  (rhl-, Aug 21, 2016)
0a15f61  Odd that I missed a multMeta.  (rhl-, Aug 22, 2016)
1a83151  fix Get v GetLocal weirdness.  (rhl-, Aug 22, 2016)
a1e8579  fix Get v GetLocal weirdness.  (rhl-, Aug 22, 2016)
2ead9d5  Addresses review.  (rhl-, Aug 23, 2016)
include/El/blas_like/level1/DiagonalScale.hpp (1 addition, 2 deletions)

@@ -231,8 +231,7 @@ void DiagonalScale
     LogicError("The size of d must match the width of A");
 )
 A.InitializeMultMeta();
-const auto& meta = A.multMeta;
-
+const auto& meta = A.LockedDistGraph().multMeta;
 // Pack the send values
 const Int numSendInds = meta.sendInds.size();
 vector<T> sendVals;
include/El/blas_like/level1/DiagonalSolve.hpp (2 additions, 4 deletions)

@@ -257,8 +257,7 @@ void DiagonalSolve
 else
 {
 A.InitializeMultMeta();
-const auto& meta = A.multMeta;
-
+const auto& meta = A.LockedDistGraph().multMeta;
 // Pack the send values
 const Int numSendInds = meta.sendInds.size();
 vector<F> sendVals;
@@ -308,8 +307,7 @@ void SymmetricDiagonalSolve
 const Int firstLocalRow = d.FirstLocalRow();

 A.InitializeMultMeta();
-const auto& meta = A.multMeta;
-
+const auto& meta = A.LockedDistGraph().multMeta;
 // Pack the send values
 const Int numSendInds = meta.sendInds.size();
 vector<Real> sendVals( numSendInds );
include/El/blas_like/level1/EntrywiseMap.hpp (0 additions, 1 deletion)

@@ -160,7 +160,6 @@ void EntrywiseMap
 B.remoteVals_.resize( numRemoteEntries );
 for( Int k=0; k<numRemoteEntries; ++k )
     B.remoteVals_[k] = func(A.remoteVals_[k]);
-B.multMeta = A.multMeta;
 B.ProcessQueues();
 }

Member: Why remove this?

Member Author: I think it's because a few lines earlier we issue:

    B.distGraph_ = A.distGraph_;

which assigns the multMeta objects.

Member: Ah, agreed.
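A minimal sketch of that reasoning, written at the DistGraph level rather than inside EntrywiseMap (the function name is hypothetical; EntrywiseMap itself touches these members as a friend): once multMeta lives on the graph, copying the graph also copies the cached communication metadata, so a separate metadata assignment is redundant.

#include <El.hpp>
using namespace El;

void MetaFollowsAssignment( const DistGraph& A )
{
    A.InitializeMultMeta();  // cache A's metadata (multMeta is mutable)
    DistGraph B( A.Comm() );
    B = A;                   // DistGraph::operator= carries multMeta along,
                             // so no "B.multMeta = A.multMeta;" is needed
}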
include/El/blas_like/level3.hpp (6 additions, 0 deletions)

@@ -159,6 +159,12 @@ void Multiply
   T alpha, const SparseMatrix<T>& A, const Matrix<T>& X,
   T beta, Matrix<T>& Y );

+template<typename T>
+void Multiply
+( Orientation orientation,
+  T alpha, const Graph& A, const Matrix<T>& X,
+  T beta, Matrix<T>& Y );
+
 template<typename T>
 void Multiply
 ( Orientation orientation,
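A hedged usage sketch of the new Graph overload, forming Y := alpha*op(A)*X + beta*Y with the graph's edges presumably acting as unit entries. It assumes the sequential Graph mirrors the DistGraph assembly API shown later in this PR (Reserve/Connect/ProcessQueues); the diff itself only shows the declaration.

#include <El.hpp>
using namespace El;

void GraphMultiplySketch()
{
    Graph A( 4, 4 );    // 4 sources, 4 targets
    A.Reserve( 3 );
    A.Connect( 0, 1 );  // edge (0,1), i.e., a nonzero at row 0, column 1
    A.Connect( 1, 2 );
    A.Connect( 3, 0 );
    A.ProcessQueues();

    Matrix<double> X, Y;
    Uniform( X, 4, 2 ); // a 4 x 2 multivector of right-hand sides
    Zeros( Y, 4, 2 );

    // Y := 1*A*X + 0*Y
    Multiply( NORMAL, 1., A, X, 0., Y );
}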
include/El/core.hpp (1 addition, 1 deletion)

@@ -235,7 +235,7 @@ class DistMatrix;
 #include <El/core/Matrix/decl.hpp>
 #include <El/core/Graph/decl.hpp>
 #include <El/core/DistMap/decl.hpp>
-#include <El/core/DistGraph/decl.hpp>
+#include <El/core/DistGraph.hpp>
 #include <El/core/SparseMatrix/decl.hpp>
 #include <El/core/DistSparseMatrix/decl.hpp>
 #include <El/core/DistMultiVec/decl.hpp>

poulson (Member), Aug 23, 2016: Why include DistGraph.hpp rather than DistGraph/decl.hpp?
include/El/core/DistGraph.hpp (201 additions, 1 deletion)

@@ -11,6 +11,206 @@
 #ifndef EL_CORE_DISTGRAPH_HPP
 #define EL_CORE_DISTGRAPH_HPP

-#include <El/core/DistGraph/decl.hpp>
+#include <set>
Member: Is there a reason for purposefully breaking the convention of splitting the declarations and implementations into decl.hpp and impl.hpp files?

Member Author: It's been a very long time since I initially made these changes. However, this convention is a bit odd; it is not as if impl.hpp is a .cpp file.

I propose that we (as much as possible) move to single, header-only implementations. Simply moving the decl/impl into one header asks the preprocessor to do less work.

Member: In practice there is a lot of movement between impl.hpp and an associated source file. For DistMatrix, there have been at least five such movements. If nothing else, consistency is important.

Member: @rhl- In general, you're asking to spend way more time compiling if you combine them. It is very common for header dependencies and transitive dependencies (header or source) to need only the declarations, which means that all those compilation units don't need to pull in the implementation and don't need to be recompiled any time the impl.hpp changes (usually much more common than decl.hpp changes). We saw a tangible improvement in PETSc compile times (especially incremental) when we split some headers.

Member Author: Hi Jed,

I'm not sure this is accurate. This is a pure header-only implementation with separate sources, so you have to spend more time because it's not all in one place. However, I agree that there is value in having only the signatures. If impl.hpp were impl.cpp, then what you are saying would be accurate.

Right now the headers extern-template instantiations for a zillion things; this assuredly makes the builds very slow, because template instantiation is relatively slow.

I'm trying to back out these changes for now.

Best,
-rhl
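For context, a minimal sketch of the extern-template mechanism mentioned above, using a hypothetical Sum rather than Elemental's actual headers: the header carries an explicit instantiation declaration so that including translation units skip instantiating the specialization, and exactly one source file provides the instantiation definition.

// sum.hpp: definition plus an explicit instantiation declaration
template<typename T>
T Sum( const T* buf, int n )
{
    T total = T(0);
    for( int i=0; i<n; ++i )
        total += buf[i];
    return total;
}
extern template double Sum<double>( const double*, int );

// sum.cpp: the single explicit instantiation definition
// #include "sum.hpp"
template double Sum<double>( const double*, int );

Note that every including translation unit still parses the definition; extern template only suppresses the redundant instantiation step.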

On Mon, Aug 22, 2016 at 4:19 PM Jed Brown notifications@github.com wrote:

In include/El/core/DistGraph.hpp
#132 (comment):

@@ -11,6 +11,206 @@
#ifndef EL_CORE_DISTGRAPH_HPP
#define EL_CORE_DISTGRAPH_HPP

-#include <El/core/DistGraph/decl.hpp>

@rhl- https://github.com/rhl- In general, you're asking to spend way
more time compiling if you combine them. It is very common for header
dependencies and transitive dependencies (header or source) to need only
the declarations, which means that all those compilation units don't need
to pull in the implementation and don't need to be recompiled any time the
impl.hpp changes (usually much more common than decl.hpp changes). We saw a
tangible improvement in PETSc compile times (especially incremental) when
we split some headers.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
https://github.com/elemental/Elemental/pull/132/files/32571ce8e750f142519b75f791742417c39b4bf9#r75777403,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AATdUWJiGKjYBPYjywtwnQin6eGRvX_Oks5qii5lgaJpZM4HVOQ_
.

jedbrown (Member), Aug 23, 2016: Can you quantify the extra time to pull in both parts from separate files versus a unified file?

Also, I really am referring to splitting headers into the part that only declares the types and the part that contains the API associated with that type. It is common for dependent headers to need only the declaration -- only compilation units that actually manipulate that type need to include the rest of the header. You'll find many such examples in /usr/include/**/*{fwd,decl,type,types}.h{,pp}. Boost is insistent about factoring headers in that way, for example.
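A minimal sketch of the split convention under discussion, using a hypothetical Widget rather than Elemental's actual headers:

// widget/decl.hpp: the class definition with member declarations;
// enough for any header or TU that only names or stores the type
#ifndef WIDGET_DECL_HPP
#define WIDGET_DECL_HPP
template<typename T>
class Widget
{
public:
    void Frob();  // declared here, defined in impl.hpp
private:
    T value_;
};
#endif

// widget/impl.hpp: the member definitions; included only by code
// that actually calls the API
#ifndef WIDGET_IMPL_HPP
#define WIDGET_IMPL_HPP
#include "widget/decl.hpp"
template<typename T>
void Widget<T>::Frob() { value_ = T(0); }
#endif

Dependents that include only decl.hpp are untouched when impl.hpp changes, which is the incremental-build win described above.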


namespace El {

struct DistGraphMultMeta
{
bool ready;
// NOTE: The 'send' and 'recv' roles reverse for adjoint multiplication
Int numRecvInds;
vector<int> sendSizes, sendOffs,
recvSizes, recvOffs;
vector<Int> sendInds, colOffs;

DistGraphMultMeta() : ready(false), numRecvInds(0) { }

void Clear()
{
ready = false;
numRecvInds = 0;
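// SwapClear swaps each vector with an empty temporary so its memory is
// actually released; clear() alone would retain the capacity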
SwapClear( sendSizes );
SwapClear( recvSizes );
SwapClear( sendOffs );
SwapClear( recvOffs );
SwapClear( sendInds );
SwapClear( colOffs );
}

const DistGraphMultMeta& operator=( const DistGraphMultMeta& meta )
{
ready = meta.ready;
numRecvInds = meta.numRecvInds;
sendSizes = meta.sendSizes;
sendOffs = meta.sendOffs;
recvSizes = meta.recvSizes;
recvOffs = meta.recvOffs;
sendInds = meta.sendInds;
colOffs = meta.colOffs;
return *this;
}
};

using std::set;

// Forward declare ldl::DistFront
namespace ldl { template<typename F> struct DistFront; }

// Use a simple 1d distribution where each process owns a fixed number of
// sources:
// if last process, numSources - (commSize-1)*floor(numSources/commSize)
// otherwise, floor(numSources/commSize)
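// e.g., numSources=10 over commSize=4 gives local source counts 2, 2, 2, 4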
class DistGraph
{
public:
// Constructors and destructors
// ============================
DistGraph( mpi::Comm comm=mpi::COMM_WORLD );
DistGraph( Int numSources, mpi::Comm comm=mpi::COMM_WORLD );
DistGraph( Int numSources, Int numTargets, mpi::Comm comm=mpi::COMM_WORLD );
DistGraph( const Graph& graph );
// TODO: Move constructor
DistGraph( const DistGraph& graph );
~DistGraph();

// Assignment and reconfiguration
// ==============================

// Making a copy
// -------------
const DistGraph& operator=( const Graph& graph );
const DistGraph& operator=( const DistGraph& graph );
// TODO: Move assignment

// Make a copy of a subgraph
// -------------------------
DistGraph operator()( Range<Int> I, Range<Int> J ) const;
DistGraph operator()( Range<Int> I, const vector<Int>& J ) const;
DistGraph operator()( const vector<Int>& I, Range<Int> J ) const;
DistGraph operator()( const vector<Int>& I, const vector<Int>& J ) const;

// Changing the graph size
// -----------------------
void Empty( bool freeMemory=true );
void Resize( Int numVertices );
void Resize( Int numSources, Int numTargets );

// Changing the distribution
// -------------------------
void SetComm( mpi::Comm comm );

// Assembly
// --------
void Reserve( Int numLocalEdges, Int numRemoteEdges=0 );

// Safe edge insertion/removal procedure
void Connect( Int source, Int target );
void ConnectLocal( Int localSource, Int target );
void Disconnect( Int source, Int target );
void DisconnectLocal( Int localSource, Int target );

void FreezeSparsity() EL_NO_EXCEPT;
void UnfreezeSparsity() EL_NO_EXCEPT;
bool FrozenSparsity() const EL_NO_EXCEPT;

// For inserting/removing a sequence of edges and then forcing consistency
void QueueConnection( Int source, Int target, bool passive=false )
EL_NO_RELEASE_EXCEPT;
void QueueLocalConnection( Int localSource, Int target )
EL_NO_RELEASE_EXCEPT;
void QueueDisconnection( Int source, Int target, bool passive=false )
EL_NO_RELEASE_EXCEPT;
void QueueLocalDisconnection( Int localSource, Int target )
EL_NO_RELEASE_EXCEPT;
void ProcessQueues();
void ProcessLocalQueues();

// For manually modifying/accessing buffers
void ForceNumLocalEdges( Int numLocalEdges );
void ForceConsistency( bool consistent=true ) EL_NO_EXCEPT;
Int* SourceBuffer() EL_NO_EXCEPT;
Int* TargetBuffer() EL_NO_EXCEPT;
Int* OffsetBuffer() EL_NO_EXCEPT;
const Int* LockedSourceBuffer() const EL_NO_EXCEPT;
const Int* LockedTargetBuffer() const EL_NO_EXCEPT;
const Int* LockedOffsetBuffer() const EL_NO_EXCEPT;
void ComputeSourceOffsets();

// Queries
// =======

// High-level data
// ---------------
Int NumSources() const EL_NO_EXCEPT;
Int NumTargets() const EL_NO_EXCEPT;
Int NumEdges() const EL_NO_EXCEPT;
Int FirstLocalSource() const EL_NO_EXCEPT;
Int NumLocalSources() const EL_NO_EXCEPT;
Int NumLocalEdges() const EL_NO_EXCEPT;
Int Capacity() const EL_NO_EXCEPT;
bool LocallyConsistent() const EL_NO_EXCEPT;

// Distribution information
// ------------------------
mpi::Comm Comm() const EL_NO_EXCEPT;
Int Blocksize() const EL_NO_EXCEPT;
int SourceOwner( Int s ) const EL_NO_RELEASE_EXCEPT;
Int GlobalSource( Int sLoc ) const EL_NO_RELEASE_EXCEPT;
Int LocalSource( Int s ) const EL_NO_RELEASE_EXCEPT;

// Detailed local information
// --------------------------
Int Source( Int localEdge ) const EL_NO_RELEASE_EXCEPT;
Int Target( Int localEdge ) const EL_NO_RELEASE_EXCEPT;
Int SourceOffset( Int localSource ) const EL_NO_RELEASE_EXCEPT;
Int Offset( Int localSource, Int target ) const EL_NO_RELEASE_EXCEPT;
Int NumConnections( Int localSource ) const EL_NO_RELEASE_EXCEPT;

// Return the ratio of the maximum number of local edges to the
// total number of edges divided by the number of processes
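// i.e., Imbalance() = max_p NumLocalEdges(p) * commSize / NumEdges(),
// which is 1.0 for a perfectly balanced edge distribution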
double Imbalance() const EL_NO_RELEASE_EXCEPT;

mutable DistGraphMultMeta multMeta;
DistGraphMultMeta InitializeMultMeta() const;
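// InitializeMultMeta builds and caches the metadata; the call sites in this
// PR invoke it and then read the cached multMeta (see DiagonalScale above)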


void AssertConsistent() const;
void AssertLocallyConsistent() const;

private:
Int numSources_, numTargets_;
mpi::Comm comm_;
// Apparently calling MPI_Comm_size in an inner loop is a very bad idea...
int commSize_;
int commRank_;

Int blocksize_;
Int numLocalSources_;

bool frozenSparsity_ = false;
vector<Int> sources_, targets_;
set<pair<Int,Int>> markedForRemoval_;

vector<Int> remoteSources_, remoteTargets_;
vector<pair<Int,Int>> remoteRemovals_;

void InitializeLocalData();

// Helpers for local indexing
bool locallyConsistent_ = true;
vector<Int> localSourceOffsets_;

friend class Graph;
friend void Copy( const Graph& A, DistGraph& B );
friend void Copy( const DistGraph& A, Graph& B );
friend void Copy( const DistGraph& A, DistGraph& B );

template<typename F> friend class DistSparseMatrix;
};

} // namespace El

#endif // ifndef EL_CORE_DISTGRAPH_HPP