Merged
27 commits
ec5d0df
Merge pull request #1 from mlcommons/main
gfursin Sep 20, 2024
c1eb857
Merge pull request #2 from mlcommons/main
gfursin Sep 24, 2024
c97f1fe
Merge pull request #3 from mlcommons/main
gfursin Sep 26, 2024
df4013c
remove sys deps from script "python hello world"
Sep 27, 2024
9a7057e
* added "dummy" script to test Docker containers
Sep 27, 2024
6b02a9d
* added better support to select Docker configurations via UID
Sep 27, 2024
5c74925
fixing docker cfg selection
Sep 27, 2024
8dbc038
clean up
Sep 27, 2024
653270c
Merge branch 'main' of github.com:flexaihq/cm4mlops
Sep 27, 2024
2087704
clean up
Sep 27, 2024
773ef26
removed ^M from setup.py
Sep 27, 2024
de55cc1
improving docker docs
Sep 27, 2024
a8feebd
Merge branch 'main' of github.com:flexaihq/cm4mlops
Sep 27, 2024
c5aeb3d
minor fix in get-cuda-devices
Sep 27, 2024
9b79c55
added --docker.key = value to cm docker script
Sep 29, 2024
07ef382
docker fixes
Sep 29, 2024
2924ce6
added ubuntu 24.04 config
Sep 29, 2024
e3b43b8
removed sys deps from image-classification examples
Sep 30, 2024
62c38ef
removed sys deps from image-classification examples and added YAML
Sep 30, 2024
f8a4e7a
fixing debug examples for customize.py and wrapped Python code (exter…
Sep 30, 2024
79871c0
adding latest wget.exe to get-sys-utils-cm
Oct 1, 2024
1051735
Merge pull request #4 from mlcommons/main
gfursin Oct 1, 2024
2593657
Merge pull request #315 from flexaihq/main
gfursin Oct 1, 2024
5419871
turn on tests on Windows
Oct 1, 2024
44251aa
Merge pull request #317 from flexaihq/main
gfursin Oct 1, 2024
a92f8d5
* removed windows test for MLPerf (requires interaction)
Oct 1, 2024
09a09a1
Merge pull request #320 from flexaihq/main
gfursin Oct 1, 2024
6 changes: 3 additions & 3 deletions .github/workflows/test-image-classification-onnx.yml
@@ -13,12 +13,12 @@ on:

jobs:
build:

runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
python-version: [ "3.12"]
os: [ubuntu-latest, windows-latest, macos-latest]
python-version: [ "3.10", "3.12"]

steps:
- uses: actions/checkout@v3
2 changes: 2 additions & 0 deletions .github/workflows/test-mlperf-inference-resnet50.yml
@@ -28,6 +28,8 @@ jobs:
- os: macos-latest
backend: tf
- os: windows-latest
# MLPerf requires interaction when installing LLVM on Windows - that's why we excluded it here


steps:
- uses: actions/checkout@v4
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -1,3 +1,8 @@
### 20240927
* added "test dummy" script to test Docker containers
* added more standard Nvidia Docker configuration for PyTorch
* added better support to select Docker configurations via UID

### 20240916
* fixed "cm add script"

2 changes: 1 addition & 1 deletion COPYRIGHT.txt
@@ -1,5 +1,5 @@
Copyright (c) 2021-2024 MLCommons

The cTuning foundation and OctoML donated this project to MLCommons to benefit everyone.
Grigori Fursin, the cTuning foundation and OctoML donated this project to MLCommons to benefit everyone.

Copyright (c) 2014-2021 cTuning foundation
5 changes: 5 additions & 0 deletions README.md
@@ -139,6 +139,11 @@ cm run script \

[Apache 2.0](LICENSE.md)

## CM concepts

* https://doi.org/10.5281/zenodo.8105339
* https://arxiv.org/abs/2406.16791

## Authors

[Grigori Fursin](https://cKnowledge.org/gfursin) and [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh)
52 changes: 48 additions & 4 deletions automation/script/module.py
@@ -4028,14 +4028,58 @@ def docker(self, i):

(out) (str): if 'con', output to console

parsed_artifact (list): prepared in CM CLI or CM access function
[ (artifact alias, artifact UID) ] or
[ (artifact alias, artifact UID), (artifact repo alias, artifact repo UID) ]

(repos) (str): list of repositories to search for automations

(output_dir) (str): output directory (./ by default)

(docker) (dict): convert keys into docker_{key} strings for CM >= 2.3.8.1


(docker_skip_build) (bool): do not generate Dockerfiles and do not recreate Docker image (must exist)
(docker_noregenerate) (bool): do not generate Dockerfiles
(docker_norecreate) (bool): do not recreate Docker image

(docker_cfg) (str): if True, show all available basic docker configurations, otherwise pre-select one
(docker_cfg_uid) (str): if set, select docker configuration with this UID

(docker_path) (str): where to create or find Dockerfile
(docker_gh_token) (str): GitHub token for private repositories
(docker_save_script) (str): if !='' name of script to save docker command
(docker_interactive) (bool): if True, run in interactive mode
(docker_it) (bool): the same as `docker_interactive`
(docker_detached) (bool): detach Docker
(docker_dt) (bool): the same as `docker_detached`

(docker_base_image) (str): force base image
(docker_os) (str): force docker OS (default: ubuntu)
(docker_os_version) (str): force docker OS version (default: 22.04)
(docker_image_tag_extra) (str): add extra tag (default:-latest)

(docker_cm_repo) (str): force CM automation repository when building Docker (default: cm4mlops)
(docker_cm_repos)
(docker_cm_repo_flags)

(dockerfile_env)

(docker_skip_cm_sys_upgrade) (bool): if True, do not install CM sys deps

(docker_extra_sys_deps)

(fake_run_deps)
(docker_run_final_cmds)

(all_gpus)
(num_gpus)

(docker_device)

(docker_port_maps)

(docker_shm_size)

(docker_extra_run_args)


Returns:
(CM return dict):

57 changes: 30 additions & 27 deletions automation/script/module_misc.py
@@ -1335,15 +1335,9 @@ def dockerfile(i):
Args:
(CM input dict):

(out) (str): if 'con', output to console

parsed_artifact (list): prepared in CM CLI or CM access function
[ (artifact alias, artifact UID) ] or
[ (artifact alias, artifact UID), (artifact repo alias, artifact repo UID) ]

(repos) (str): list of repositories to search for automations

(output_dir) (str): output directory (./ by default)
(out) (str): if 'con', output to console
(repos) (str): list of repositories to search for automations
(output_dir) (str): output directory (./ by default)

Returns:
(CM return dict):
@@ -1632,15 +1626,6 @@ def docker(i):

(out) (str): if 'con', output to console

(docker_skip_build) (bool): do not generate Dockerfiles and do not recreate Docker image (must exist)
(docker_noregenerate) (bool): do not generate Dockerfiles
(docker_norecreate) (bool): do not recreate Docker image

(docker_path) (str): where to create or find Dockerfile
(docker_gh_token) (str): GitHub token for private repositories
(docker_save_script) (str): if !='' name of script to save docker command
(docker_interactive) (bool): if True, run in interactive mode
(docker_cfg) (str): if True, show all available basic docker configurations, otherwise pre-select one

Returns:
(CM return dict):
@@ -1653,6 +1638,20 @@ def docker(i):
import copy
import re

from cmind import __version__ as current_cm_version

self_module = i['self_module']

if type(i.get('docker', None)) == dict:
# Grigori started cleaning and refactoring this code on 20240929
#
# 1. use --docker dictionary instead of --docker_{keys}

if utils.compare_versions(current_cm_version, '2.3.8.1') >= 0:
docker_params = utils.convert_dictionary(i['docker'], 'docker')
i.update(docker_params)
del(i['docker'])

quiet = i.get('quiet', False)

detached = i.get('docker_detached', '')
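Reviewer note: the `docker` dictionary branch in this hunk can be illustrated in isolation. The sketch below is an assumption about the intended behaviour, not CM's code. `prefix_keys` is a hypothetical stand-in for `utils.convert_dictionary` (its real implementation is not part of this diff), `version_tuple` is a simplified stand-in for `utils.compare_versions` that only handles purely numeric versions, and the deletion of `i['docker']` is assumed to sit inside the version check because indentation is lost in this view.

```python
def version_tuple(version):
    """Turn a dotted numeric version string such as '2.3.8.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def prefix_keys(params, prefix):
    """Hypothetical stand-in for utils.convert_dictionary: {'os': 'x'} -> {'docker_os': 'x'}."""
    return {'{}_{}'.format(prefix, key): value for key, value in params.items()}


def flatten_docker_input(i, cm_version):
    """Replace a dict stored under i['docker'] with flat docker_{key} entries."""
    if isinstance(i.get('docker'), dict) and version_tuple(cm_version) >= version_tuple('2.3.8.1'):
        i.update(prefix_keys(i['docker'], 'docker'))
        del i['docker']
    return i


# Hypothetical input: the 'tags' value and the docker keys are illustrative only.
flat = flatten_docker_input(
    {'tags': 'test,dummy', 'docker': {'os': 'ubuntu', 'os_version': '24.04'}},
    '2.3.8.1',
)
print(flat)
```

With an older version string such as `'2.3.8'` the input is returned unchanged, so the nested `docker` key would reach the rest of `docker()` untouched.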
@@ -1670,13 +1669,12 @@ def docker(i):

# Check simplified CMD: cm docker script "python app image-classification onnx"
# If artifact has spaces, treat them as tags!
self_module = i['self_module']
self_module.cmind.access({'action':'detect_tags_in_artifact', 'automation':'utils', 'input':i})

# CAREFUL -> artifacts and parsed_artifacts are not supported in input (and should not be?)
if 'artifacts' in i: del(i['artifacts'])
if 'parsed_artifacts' in i: del(i['parsed_artifacts'])

# Prepare "clean" input to replicate command
r = self_module.cmind.access({'action':'prune_input', 'automation':'utils', 'input':i, 'extra_keys_starts_with':['docker_']})
i_run_cmd_arc = r['new_input']
@@ -1693,13 +1691,19 @@ def docker(i):

# Check available configurations
docker_cfg = i.get('docker_cfg', '')
if docker_cfg != '':
docker_cfg_uid = i.get('docker_cfg_uid', '')

if docker_cfg != '' or docker_cfg_uid != '':
# Check if docker_cfg is turned on but not selected
if type(docker_cfg) == bool or str(docker_cfg).lower() in ['true','yes']:
docker_cfg= ''

r = self_module.cmind.access({'action':'select_cfg', 'automation':'utils,dc2743f8450541e3',
'tags':'basic,docker,configurations', 'title':'docker', 'alias':docker_cfg})

r = self_module.cmind.access({'action':'select_cfg',
'automation':'utils,dc2743f8450541e3',
'tags':'basic,docker,configurations',
'title':'docker',
'alias':docker_cfg,
'uid':docker_cfg_uid})
if r['return'] > 0:
if r['return'] == 16:
return {'return':1, 'error':'Docker configuration {} was not found'.format(docker_cfg)}
@@ -1708,10 +1712,9 @@ def docker(i):
selection = r['selection']

docker_input_update = selection['meta']['input']

i.update(docker_input_update)


########################################################################################
# Run dockerfile
if not noregenerate_docker_file:
@@ -1722,7 +1725,7 @@ def docker(i):
cur_dir = os.getcwd()

console = i.get('out') == 'con'

# Search for script(s)
r = aux_search({'self_module': self_module, 'input': i})
if r['return']>0: return r
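Reviewer note: the `docker_cfg` / `docker_cfg_uid` check in `docker()` treats a bare flag differently from a named configuration. A minimal sketch of that decision, assuming the logic reads as it appears in this diff; `normalize_docker_cfg` is a hypothetical helper name and does not exist in CM.

```python
def normalize_docker_cfg(docker_cfg='', docker_cfg_uid=''):
    """Return (needs_selection, alias, uid) following the check in docker()."""
    # Nothing requested: skip configuration selection entirely.
    if docker_cfg == '' and docker_cfg_uid == '':
        return (False, '', '')

    # A bare flag (a bool, 'true' or 'yes') means "show the list", not "use this alias".
    if isinstance(docker_cfg, bool) or str(docker_cfg).lower() in ('true', 'yes'):
        docker_cfg = ''

    return (True, docker_cfg, docker_cfg_uid)


print(normalize_docker_cfg(True))
print(normalize_docker_cfg('basic-ubuntu-24.04'))
print(normalize_docker_cfg(docker_cfg_uid='12e86eb386314866'))
```

The alias `basic-ubuntu-24.04` is the file name of the configuration added in this PR without its `.yaml` extension, which is how `select_cfg` derives aliases (`f[:-5]`).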
62 changes: 36 additions & 26 deletions automation/utils/module_cfg.py
@@ -230,16 +230,18 @@ def select_cfg(i):
self_module = i['self_module']
tags = i['tags']
alias = i.get('alias', '')
uid = i.get('uid', '')
title = i.get('title', '')

# Check if alias is not provided
r = self_module.cmind.access({'action':'find', 'automation':'cfg', 'tags':'basic,docker,configurations'})
if r['return'] > 0: return r

lst = r['list']

selector = []

# Do coarse-grain search for CM artifacts
for l in lst:
p = l.path

@@ -257,45 +259,53 @@ def select_cfg(i):
if not f.startswith('_cm') and (f.endswith('.json') or f.endswith('.yaml')):
selector.append({'path':os.path.join(p, f), 'alias':f[:-5]})

if len(selector) == 0:
return {'return':16, 'error':'configuration was not found'}

select = 0
if len(selector) > 1:
xtitle = ' ' + title if title!='' else ''
print ('')
print ('Available{} configurations:'.format(xtitle))

print ('')
# Load meta for name and UID
selector_with_meta = []
for s in range(0, len(selector)):
ss = selector[s]

for s in range(0, len(selector)):
ss = selector[s]
path = ss['path']

path = ss['path']
full_path_without_ext = path[:-5]

full_path_without_ext = path[:-5]
r = cmind.utils.load_yaml_and_json(full_path_without_ext)
if r['return']>0:
print ('Warning: problem loading configuration file {}'.format(path))

r = cmind.utils.load_yaml_and_json(full_path_without_ext)
if r['return']>0:
print ('Warning: problem loading configuration file {}'.format(path))
meta = r['meta']

meta = r['meta']
if uid == '' or meta.get('uid', '') == uid:
ss['meta'] = meta
selector_with_meta.append(ss)

# Quit if no configurations found
if len(selector_with_meta) == 0:
return {'return':16, 'error':'configuration was not found'}

selector = sorted(selector, key = lambda x: x['meta'].get('name',''))
select = 0
if len(selector_with_meta) > 1:
xtitle = ' ' + title if title!='' else ''
print ('')
print ('Available{} configurations:'.format(xtitle))

print ('')

selector_with_meta = sorted(selector_with_meta, key = lambda x: x['meta'].get('name',''))
s = 0
for ss in selector:
for ss in selector_with_meta:
alias = ss['alias']
name = ss['meta'].get('name','')
uid = ss['meta'].get('uid', '')
name = ss['meta'].get('name', '')

x = name
if x!='': x+=' '
x += '('+alias+')'
print ('{}) {}'.format(s, x))
x += '(' + uid + ')'

print ('{}) {}'.format(s, x))

s+=1

print ('')
select = input ('Enter configuration number or press Enter for 0: ')

@@ -306,6 +316,6 @@ def select_cfg(i):
if select<0 or select>=len(selector_with_meta):
return {'return':1, 'error':'selection is out of range'}

ss = selector[select]
ss = selector_with_meta[select]

return {'return':0, 'selection':ss}
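Reviewer note: the new UID filter and name sort in `select_cfg` can be sketched on plain dictionaries. This is an illustration under stated assumptions rather than the module's code: `filter_and_sort` is a hypothetical helper, and the alias `nvidia-pytorch-example` is invented because the file name of the Nvidia configuration is not visible in this diff. The UIDs and names are taken from the two configuration files shown in this PR.

```python
def filter_and_sort(configs, uid=''):
    """Keep configurations whose meta UID matches (all of them if uid is empty), sorted by name."""
    kept = [c for c in configs if uid == '' or c['meta'].get('uid', '') == uid]
    return sorted(kept, key=lambda c: c['meta'].get('name', ''))


configs = [
    {
        'alias': 'nvidia-pytorch-example',  # hypothetical alias
        'meta': {
            'uid': '854e65fb31584d63',
            'name': 'Nvidia Ubuntu 20.04 CUDA 11.8 cuDNN 8.6.0 PyTorch 1.13.0 (pytorch:22.10)',
        },
    },
    {
        'alias': 'basic-ubuntu-24.04',
        'meta': {'uid': '12e86eb386314866', 'name': 'Basic Ubuntu 24.04'},
    },
]

# No UID: every configuration stays, ordered by name, so a menu would be shown.
print([c['alias'] for c in filter_and_sort(configs)])

# With a UID: at most one entry remains, so no menu is needed.
print([c['alias'] for c in filter_and_sort(configs, '854e65fb31584d63')])
```

Because the menu index is taken from the filtered list, any range check on the user's selection should use the length of the filtered list as well.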
39 changes: 39 additions & 0 deletions cfg/benchmark-run-mlperf-inference-v4.1/_cm.yaml
@@ -0,0 +1,39 @@
alias: benchmark-run-mlperf-inference-v4.1
uid: b7e89771987d4168

automation_alias: cfg
automation_uid: 88dce9c160324c5d

tags:
- benchmark
- run
- mlperf
- inference
- v4.1

name: "MLPerf inference - v4.1"

supported_compute:
- ee8c568e0ac44f2b
- fe379ecd1e054a00
- d8f06040f7294319

bench_uid: 39877bb63fb54725

view_dimensions:
- - input.device
- "MLPerf device"
- - input.implementation
- "MLPerf implementation"
- - input.backend
- "MLPerf backend"
- - input.model
- "MLPerf model"
- - input.scenario
- "MLPerf scenario"
- - input.host_os
- "Host OS"
- - output.state.cm-mlperf-inference-results-last.performance
- "Got performance"
- - output.state.cm-mlperf-inference-results-last.accuracy
- "Got accuracy"
9 changes: 9 additions & 0 deletions cfg/docker-basic-configurations/basic-ubuntu-24.04.yaml
@@ -0,0 +1,9 @@
uid: 12e86eb386314866

name: "Basic Ubuntu 24.04"

input:
docker_base_image: 'ubuntu:24.04'
docker_os: ubuntu
docker_os_version: '24.04'
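Reviewer note: `docker()` applies the `input` section of a selected configuration with `i.update(...)`, so configuration values win over keys the caller already set. A small sketch of that precedence, assuming the YAML above is loaded into a plain dictionary; the user input `i` is hypothetical.

```python
# Mirror of basic-ubuntu-24.04.yaml as a Python dictionary.
selection_meta = {
    'uid': '12e86eb386314866',
    'name': 'Basic Ubuntu 24.04',
    'input': {
        'docker_base_image': 'ubuntu:24.04',
        'docker_os': 'ubuntu',
        'docker_os_version': '24.04',
    },
}

# Hypothetical user input: docker_os_version collides with the configuration.
i = {'tags': 'test,dummy', 'docker_os_version': '22.04', 'docker_it': True}

# Same call shape as in docker(): the configuration overwrites colliding keys.
i.update(selection_meta['input'])

print(i['docker_os_version'])  # prints 24.04
```

Keys that the configuration does not mention, such as `docker_it` here, are left as the caller provided them.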

@@ -1,6 +1,8 @@
uid: 854e65fb31584d63

name: "Nvidia Ubuntu 20.04 CUDA 11.8 cuDNN 8.6.0 PyTorch 1.13.0"
name: "Nvidia Ubuntu 20.04 CUDA 11.8 cuDNN 8.6.0 PyTorch 1.13.0 (pytorch:22.10)"

ref_url: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-22-10.html

input:
docker_base_image: 'nvcr.io/nvidia/pytorch:22.10-py3'