- System / Infra
- Compute & Storage
- Grid computing / Super computing
- Cloud services
- Tools
- CPU
- FPGA
- GPU
- TPU
- IPU
- Performance
- Contributing
- serveo.net - Serveo is an SSH server just for remote port forwarding. When a user connects to Serveo, they get a public URL that anybody can use to connect to their localhost server. See the link for other SSH-based and related alternatives. Useful for serving resources across devices, e.g. accessing a GPU or other hardware accelerator remotely from another device. | How to forward my local port to public using Serveo? | Serveo on GitHub
- Inlets by Alex Ellis | Get started | Video
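
As a rough illustration of the remote port forwarding that Serveo relies on, the sketch below only builds the `ssh -R` command line; it does not open a connection. The helper name `serveo_command` and its defaults are made up for this example, and the local port (8888, e.g. a notebook server on a GPU machine) is an assumption.

```python
import shlex


def serveo_command(local_port, remote_port=80, local_host="localhost",
                   server="serveo.net"):
    """Build the ssh argument list for a remote port forward.

    Hypothetical helper: it asks `server` to listen on `remote_port`
    and relay traffic to `local_host:local_port` on this machine.
    """
    for port in (local_port, remote_port):
        if not 0 < port < 65536:
            raise ValueError(f"invalid TCP port: {port}")
    # -R remote_port:local_host:local_port is standard OpenSSH syntax.
    return ["ssh", "-R", f"{remote_port}:{local_host}:{local_port}", server]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is exposed
    # by accident. Pass the list to subprocess.run() to actually connect.
    print(shlex.join(serveo_command(8888)))
```

Anything forwarded this way becomes publicly reachable, so it is worth putting authentication in front of the local service first.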
- Cray Computers | Artificial Intelligence | Accel AI | Cryo-EM | Autonomous Vehicles | Geospatial AI
- GraphCore's IPU
- Lambda Labs
- NGD Systems: Technology | Solutions - High Compute Storage, Scalable Computational Storage [deadlink] | NGD Systems: Ensuring AI Advancement with Intelligent Storage
- Grid Engine: wikipedia | Univa website | Datasheet
- BOINC - High-Throughput Computing with BOINC | Tech Docs | Download BOINC | GitHub
- Cray Computers - Supercomputing as a Service
- vast.ai - GPU Sharing Economy. One simple interface to find the best cloud GPU rentals. Reduce cloud compute costs by 3X to 5X
- paperspace - The first cloud built for the future. Powering next-generation applications and cloud ML/AI pipelines. Paperspace is built to scale with your team - pay as you go option for individuals.
- valohai | docs | blogs | GitHub | Videos | Showcase | Slack | @valohaiai - Valohai is a machine learning platform. It runs your experiments in the cloud, tracks your experiment history and streamlines data science workflows. DEEP LEARNING MANAGEMENT PLATFORM. Machine Orchestration, Version Control and Pipeline Management for Deep Learning.
- Lambda Cloud GPU Instances - GPU Instances for Deep Learning & Machine Learning
- NavOps - Cloud Migration for HPC | Datasheet
- Verne Global: HPC Cloud | NVidia DGX Ready
- Weights and Biases | Learn more about WandB
- snakemake - The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Slides | PyPi
- plz - Plz (pronounced "please") runs your jobs storing code, input, outputs and results so that they can be queried programmatically.
- valohai | docs | blogs | GitHub | Videos | Showcase | Slack - Valohai is a machine learning platform. It runs your experiments in the cloud, tracks your experiment history and streamlines data science workflows. DEEP LEARNING MANAGEMENT PLATFORM. Machine Orchestration, Version Control and Pipeline Management for Deep Learning.
- Seldon - Model deployment platform, on kubernetes clusters. | docs | github | use-cases | blogs | videos
- kedro | docs | Kedro-Viz | kedro-examples - Kedro is a workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned.
- Lambda Stack - One-line installation of TensorFlow, Keras, Caffe, CUDA, cuDNN, and NVIDIA Drivers for Ubuntu 16.04 and 18.04.
- Apache Airflow - Airflow is a platform to programmatically author, schedule and monitor workflows. Use airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The airflow scheduler executes your tasks on an array of workers while following the specified dependencies.
- Nextflow - Data-driven computational pipelines. Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.
- StackHPC suites of repositories: AI, ML, DL, Cloud, HPC | StackHPC
- cortex - Machine learning deployment platform: Deploy machine learning models to production
- See also: Data > Programs and Tools
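
Several of the tools above (Airflow, Snakemake, Nextflow, Kedro) describe a workflow as a directed acyclic graph of tasks and run each task only after its dependencies have finished. The sketch below shows that idea with the Python standard library only (`graphlib`, Python 3.9+); it is not the API of any of those tools, and the task names are invented for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return task names in an order that respects the dependencies.

    `dependencies` maps each task to the set of tasks that must
    complete before it can start. Raises CycleError if the graph
    is not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical pipeline: task -> tasks it depends on.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "evaluate": {"train"},
    "report": {"train", "evaluate"},
}

if __name__ == "__main__":
    for step, task in enumerate(execution_order(pipeline), start=1):
        print(step, task)

    # A dependency cycle is rejected rather than scheduled.
    try:
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("cycle detected")
```

A real scheduler additionally runs independent tasks in parallel, retries failures, and persists state; this sketch covers only the ordering step.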
- Probing the CPU (Linux/MacOS)
- Zero overhead performance capturing: use `/proc/interrupts` and `/proc/softirqs`
- Non-zero overhead, less accurate: use the PMU (capture on- and off-core events)
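
A minimal sketch of reading the counters in `/proc/interrupts` or `/proc/softirqs` on Linux is shown below. It assumes the usual layout of those files (a header row of CPU names, then one row per source with one count per CPU); the function names are invented for this example and the sample text in the usage note is made up, not real measurements.

```python
def parse_counters(text):
    """Sum the per-CPU counts for each row of /proc/interrupts
    or /proc/softirqs and return {label: total}."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a counter row
        total = 0
        # Rows may carry fewer counts than CPUs (e.g. ERR), and a
        # free-text description follows the counts.
        for field in rest.split()[:cpu_count]:
            if not field.isdigit():
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def counter_delta(before, after):
    """Counts accumulated between two snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as handle:
            snapshot = parse_counters(handle.read())
    except OSError:
        print("/proc/interrupts is not available on this system")
    else:
        busiest = sorted(snapshot.items(), key=lambda kv: kv[1], reverse=True)
        for label, total in busiest[:5]:
            print(label, total)
```

Taking two snapshots around a workload and passing them to `counter_delta` gives the interrupts raised during that interval.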
- Probing the CPU (Windows)
- perfview - general profiling on Windows
- perfview for .net - excellent overview by Sasha Goldshtein
- Intel
- Intel® Developer Zone
- Intel® AI Developer Home Page
- Intel® AI Developer Webinar Series | All webinars listing
- The PlaidML Tensor Compiler - webinar
- nGraph - Unlocking next-generation performance with deep learning compilers: webinar | slides | homepage | github
- Also see Intel in Courses
Thanks to the great minds on the mechanical sympathy mailing list for their responses to my queries on CPU probing.
- Using FPGAs for Datacenter Acceleration | Windows AI | Intel® Distribution of OpenVINO™ Toolkit: Develop Multiplatform Computer Vision Solutions
- Also see FPGA in Courses
- Know your GPU
- GPU Server 1 of 2 | GPU Server 2 of 2 | Applications of GPU servers - check out the manufacturers
- Embedded Vision Solutions for NVIDIA Jetson Series | Embedded Vision Family Brochure
- Avermedia Box PC & Carrier (works with NVidia Jetson): 1 | 2
- How to harness the Powers of the Cloud TPU
- How-tos
- All tutorials
- Command-line interface
- Cloud TPU tools
- Performance Guide
- TPU Estimator API
- Using BFloat
- Advanced Guide to Inception V3 on Cloud TPU
- Examples
- GraphCore | Videos: Simon Knowles - More complex models and more powerful machines | Graphcore tech Concept | A new kind of hardware designed for machine intelligence - GraphCore Presentations | VIDEO: SCALING THROUGHPUT PROCESSORS FOR MACHINE INTELLIGENCE
- MLPerf - Fair and useful benchmarks for measuring training and inference performance of ML hardware, software, and services.
- MLPerf introduces machine learning inference benchmark suite...
- ONE DEEP LEARNING BENCHMARK TO RULE THEM ALL
- mlbench: Distributed Machine Learning Benchmark - A public and reproducible collection of reference implementations and benchmark suite for distributed machine learning algorithms, frameworks and systems.
- EEMBC MLMark Benchmark - The EEMBC MLMark benchmark is a machine-learning (ML) benchmark designed to measure the performance and accuracy of embedded inference.
- DeepOBS: A Deep Learning Optimizer Benchmark Suite
- PMLB - a large benchmark suite for machine learning evaluation and comparison
- Deep Learning Benchmarking Suite | HPE Deep Learning Cookbook
Contributions are very welcome. Please share back with the wider community (and get credited for it)!
Please have a look at the CONTRIBUTING guidelines, and also read about our licensing policy.
Back to main page (table of contents)