Commit
A better README file (#1079)
* Updated the tutorial document.

1. Corrected the spelling mistake -> (sigular to single)
2. Corrected the statement -> the number of dimensions is the rank of the array.
3. Made 2 more small changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

* Updated README.md file, Added contribution Guidelines section, Updated the installation, and Hacking sections with code snippets.

* Added scripts, in the Getting Started section, inspired from the README of Tensorflow.

* Added resources list in the Getting started section, and Updated the contributing Guidelines sections, (inspired from Numpy, Scipy's README).

* Provided a quick guide that tells everything about the GSoC 2023 program in the contributing Guidelines section.

* Created a new file that will work as a GitHub action to check all markdown-files in the root of the repository for any broken links. (It runs only once a week).

* Changed one of the badge under the Project status.

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>

* Added a section, New, which can be used to announce news.

---------

Co-authored-by: SaiSuraj27 <87087741+SaiSuraj27@users.noreply.github.com>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
4 people authored Feb 9, 2023
1 parent bcea48a commit fade48d
Showing 2 changed files with 117 additions and 45 deletions.
20 changes: 20 additions & 0 deletions .github/workflows/markdown-links-check.yml
@@ -0,0 +1,20 @@
name: Markdown Links Check
# runs every monday at 9 am
on:
  schedule:
    - cron: "0 9 * * 1"

jobs:
  check-links:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - uses: gaurav-nelson/github-action-markdown-link-check@v1
        # checks all markdown files from root but ignores subfolders
        # By Removing the max-depth variable we can modify it -> to check all the .md files in the entire repo.
        with:
          use-quiet-mode: 'yes'
          # Specifying yes to show only errors in the output
          use-verbose-mode: 'yes'
          # Specifying yes to show detailed HTTP status for checked links.
          max-depth: 0
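
To make the schedule in this workflow easier to read, here is a minimal Python sketch that checks whether a given time matches the expression `0 9 * * 1`. GitHub Actions evaluates scheduled workflows in UTC. The helper `cron_matches` is hypothetical and written only for illustration: it supports `*` and plain integers, not ranges, lists, or steps, and it is not how GitHub schedules jobs internally.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a five-field cron expression.

    Field order: minute, hour, day of month, month, day of week.
    Simplified on purpose: only "*" and plain integers are understood.
    """
    parts = expr.split()
    if len(parts) != 5:
        raise ValueError("expected five space-separated fields")
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,  # cron convention: 0 = Sunday, 1 = Monday
    )
    return all(p == "*" or int(p) == a for p, a in zip(parts, actual))


if __name__ == "__main__":
    # 13 February 2023 was a Monday.
    print(cron_matches("0 9 * * 1", datetime(2023, 2, 13, 9, 0)))
    print(cron_matches("0 9 * * 1", datetime(2023, 2, 14, 9, 0)))
```

Read field by field, the expression means minute 0, hour 9, any day of the month, any month, on day-of-week 1 (Monday).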
142 changes: 97 additions & 45 deletions README.md
@@ -6,23 +6,21 @@

Heat is a distributed tensor framework for high performance data analytics.

Project Status
--------------
# Project Status

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2531472.svg)](https://doi.org/10.5281/zenodo.2531472)
[![Mirror and run GitLab CI](https://github.com/helmholtz-analytics/heat/actions/workflows/ci_cb.yml/badge.svg)](https://github.com/helmholtz-analytics/heat/actions/workflows/ci_cb.yml)
[![Documentation Status](https://readthedocs.org/projects/heat/badge/?version=latest)](https://heat.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/helmholtz-analytics/heat/branch/main/graph/badge.svg)](https://codecov.io/gh/helmholtz-analytics/heat)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![license: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://pepy.tech/badge/heat)](https://pepy.tech/project/heat)

NEW!
--------------
- [Quick Start](quick_start.md) for new users and contributors (Jan 14, 2023)

# New

[Quick Start](quick_start.md) for new users and contributors (Jan 14, 2023).

Goals
-----
# Goals

Heat is a flexible and seamless open-source software for high performance data
analytics and machine learning. It provides highly optimized algorithms and data
@@ -37,90 +35,145 @@ scientific and data science applications.
Heat allows you to tackle your actual Big Data challenges that go beyond the
computational and memory needs of your laptop and desktop.

Features
--------
# Features

* High-performance n-dimensional tensors
* CPU, GPU and distributed computation using MPI
* Powerful data analytics and machine learning methods
* Abstracted communication via split tensors
* Python API

Getting Started
---------------

TL;DR: [Quick Start](quick_start.md)

Check out our Jupyter Notebook [tutorial]((https://github.com/helmholtz-analytics/heat/blob/main/scripts/)tutorial.ipynb)
right here on Github or in the /scripts directory.

The complete documentation of the latest version is always deployed on
[Read the Docs](https://heat.readthedocs.io/).

Support Channels
----------------
# Support Channels

We use [StackOverflow](https://stackoverflow.com/tags/pyheat/) as a forum for questions about Heat.
If you do not find an answer to your question, then please ask a new question there and be sure to
tag it with "pyheat".

You can also reach us on [GitHub Discussions](https://github.com/helmholtz-analytics/heat/discussions).

Requirements
------------
# Requirements

Heat requires Python 3.7 or newer.
Heat is based on [PyTorch](https://pytorch.org/). Specifically, we are exploiting
PyTorch's support for GPUs *and* MPI parallelism. For MPI support we utilize
[mpi4py](https://mpi4py.readthedocs.io). Both packages can be installed via pip
or automatically using the setup.py.


Installation
------------

TL;DR: [Quick Start](quick_start.md)
# Installation

Tagged releases are made available on the
[Python Package Index (PyPI)](https://pypi.org/project/heat/). You can typically
install the latest version with

> $ pip install heat[hdf5,netcdf]
```
$ pip install heat[hdf5,netcdf]
```

where the part in brackets is a list of optional dependencies. You can omit
it if you do not need HDF5 or NetCDF support.

**It is recommended to use the most recent supported version of PyTorch!**

It is also very important to ensure that the PyTorch version is compatible with the local CUDA installation.
**It is also very important to ensure that the PyTorch version is compatible with the local CUDA installation.**
More information can be found [here](https://pytorch.org/get-started/locally/).

Hacking
-------
TL;DR: [Quick Start](quick_start.md)
# Hacking

If you want to work with the development version, you can check out the sources using

> $ git clone https://github.com/helmholtz-analytics/heat.git
```
$ git clone https://github.com/helmholtz-analytics/heat.git
```

The installation can then be done from the checked-out sources with

> $ pip install .[hdf5,netcdf,dev]
```
$ pip install .[hdf5,netcdf,dev]
```

# Getting Started

TL;DR: [Quick Start](quick_start.md) (Read this to get a quick overview of Heat).

Check out our Jupyter Notebook [**Tutorial**](https://github.com/helmholtz-analytics/heat/blob/main/scripts/)
right here on GitHub or in the /scripts directory to learn the basics of how Heat works.

The complete documentation of the latest version is always deployed on
[Read the Docs](https://heat.readthedocs.io/).

***Try your first Heat program***

```shell
$ python
```

```python
>>> import heat as ht
>>> x = ht.arange(10,split=0)
>>> print(x)
DNDarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ht.int32, device=cpu:0, split=0)
>>> y = ht.ones(10,split=0)
>>> print(y)
DNDarray([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=ht.float32, device=cpu:0, split=0)
>>> print(x + y)
DNDarray([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.], dtype=ht.float32, device=cpu:0, split=0)
```

### Also, you can test your setup by running the [`heat_test.py`](https://github.com/helmholtz-analytics/heat/blob/main/scripts/heat_test.py) script:

```shell
mpirun -n 2 python heat_test.py
```

### It should print something like this:

```shell
x is distributed: True
Global DNDarray x: DNDarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ht.int32, device=cpu:0, split=0)
Global DNDarray x:
Local torch tensor on rank 0 : tensor([0, 1, 2, 3, 4], dtype=torch.int32)
Local torch tensor on rank 1 : tensor([5, 6, 7, 8, 9], dtype=torch.int32)
```
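
The output above shows each of the two ranks holding a contiguous half of `x`. The following plain-Python sketch reproduces that layout without requiring Heat or MPI. The function `block_partition` is a hypothetical helper, and handing any remainder to the lowest ranks is an assumption made for illustration, not a statement about Heat's internal chunking.

```python
from typing import List


def block_partition(n_items: int, n_ranks: int) -> List[range]:
    """Split range(n_items) into n_ranks contiguous chunks.

    Assumption for illustration: when n_items is not divisible by n_ranks,
    the lowest ranks each receive one extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    chunks = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # Ten elements over two ranks, as in `mpirun -n 2` with split=0.
    for rank, chunk in enumerate(block_partition(10, 2)):
        print(f"rank {rank}: {list(chunk)}")
```

For ten elements and two ranks this gives rank 0 the values 0 to 4 and rank 1 the values 5 to 9, matching the local tensors printed by the test script.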

## Resources:

* [Heat Tutorials](https://heat.readthedocs.io/en/latest/tutorials.html)
* [Heat API Reference](https://heat.readthedocs.io/en/latest/autoapi/index.html)

### Parallel Computing and MPI:

* @davidhenty's [course](https://www.archer2.ac.uk/training/courses/200514-mpi/)
* Wes Kendall's [Tutorials](https://mpitutorial.com/tutorials/)

### mpi4py

* [mpi4py docs](https://mpi4py.readthedocs.io/en/stable/tutorial.html)
* [Tutorial](https://www.kth.se/blogs/pdc/2019/08/parallel-programming-in-python-mpi4py-part-1/)

# Contribution guidelines

**We welcome contributions from the community. If you want to contribute to Heat, be sure to review the [Contribution Guidelines](contributing.md) before getting started!**

We use [GitHub issues](https://github.com/helmholtz-analytics/heat/issues) for tracking requests and bugs. Please see [Discussions](https://github.com/helmholtz-analytics/heat/discussions) for general questions and discussion. You can also get in touch with us on [Mattermost](https://mattermost.hzdr.de/signup_user_complete/?id=3sixwk9okpbzpjyfrhen5jpqfo), where you can sign up with your GitHub credentials. Once you log in, you can introduce yourself on the `Town Square` channel.

Small improvements or fixes are always appreciated; issues labeled as **"good first issue"** may be a good starting point.

If you’re unsure where to start or how your skills fit in, reach out! You can ask us here on GitHub by leaving a comment on a relevant issue that is already open.

**If you are new to contributing to open source, [this guide](https://opensource.guide/how-to-contribute/) helps explain why, what, and how to get involved.**

We welcome contributions from the community, please check out our [Contribution Guidelines](contributing.md) before getting started!
### For people who want to contribute through the GSoC 2023 program, here is a [quick guide](https://github.com/MLSC-BSOITR/Ultimate-GSOC-Guide/blob/main/GSoC2023Presentation.pdf) to the complete program.

License
-------
# License

Heat is distributed under the MIT license, see our
[LICENSE](LICENSE) file.

Citing Heat
-----------
# Citing Heat

If you find Heat helpful for your research, please mention it in your publications. You can cite:

- Götz, M., Debus, C., Coquelin, D., Krajsek, K., Comito, C., Knechtges, P., Hagemeier, B., Tarnawa, M., Hanselmann, S., Siggel, S., Basermann, A. & Streit, A. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE, DOI: 10.1109/BigData50022.2020.9378050.
* Götz, M., Debus, C., Coquelin, D., Krajsek, K., Comito, C., Knechtges, P., Hagemeier, B., Tarnawa, M., Hanselmann, S., Siggel, S., Basermann, A. & Streit, A. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE, DOI: 10.1109/BigData50022.2020.9378050.

```
@inproceedings{heat2020,
@@ -148,8 +201,7 @@ If you find Heat helpful for your research, please mention it in your publicatio
}
```

Acknowledgements
----------------
## Acknowledgements

*This work is supported by the [Helmholtz Association Initiative and
Networking Fund](https://www.helmholtz.de/en/about_us/the_association/initiating_and_networking/)