Migrate hyperzoo (intel-analytics#4958)
* add hyperzoo for k8s support (intel-analytics#2140)

* add hyperzoo for k8s support

* format

* format

* format

* format

* run examples on k8s readme (intel-analytics#2163)

* k8s readme

* fix jdk download issue (intel-analytics#2219)

* add doc for submit jupyter notebook and cluster serving to k8s (intel-analytics#2221)

* add hyperzoo doc

* add hyperzoo doc

* add hyperzoo doc

* add hyperzoo doc

* fix jdk download issue (intel-analytics#2223)

* bump to 0.9s (intel-analytics#2227)

* update jdk download url (intel-analytics#2259)

* update some previous docs (intel-analytics#2284)

* K8docsupdate (intel-analytics#2306)

* Update README.md

* Update s3 related links in readme and documents (intel-analytics#2489)

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* update

* update

* modify line length limit

* update

* Update mxnet-mkl version in hyper-zoo dockerfile (intel-analytics#2720)

Co-authored-by: gaoping <pingx.gao@intel.com>

* update bigdl version (intel-analytics#2743)

* update bigdl version

* hyperzoo dockerfile add cluster-serving (intel-analytics#2731)

* hyperzoo dockerfile add cluster-serving

* update

* update

* update

* update jdk url

* update jdk url

* update

Co-authored-by: gaoping <pingx.gao@intel.com>

* Support init_spark_on_k8s (intel-analytics#2813)

* initial

* fix

* code refactor

* bug fix

* update docker

* style

* add conda to docker image (intel-analytics#2894)

* add conda to docker image

* Update Dockerfile

* Update Dockerfile

Co-authored-by: glorysdj <glorysdj@gmail.com>

* Fix code blocks indents in .md files (intel-analytics#2978)

* Fix code blocks indents in .md files

Previously, many of the code blocks in the markdown files were badly indented, with stray whitespace at the beginning of lines, so users could not simply select, copy, paste, and run them (in the case of Python). I have fixed all of these; no code block now begins its lines with spurious whitespace.
It would be nice if future commits ensured that all code blocks are properly indented and start with the right amount of whitespace!

* Fix small style issue

* Fix indents

* Fix indent and add \ for multiline commands

Change indent from 3 spaces to 4, and add "\" for multiline bash commands

Co-authored-by: Yifan Zhu <fanzhuyifan@gmail.com>
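The cleanup described above boils down to stripping the whitespace shared by every line of a block; a minimal sketch of the idea in Python (the block content here is illustrative, not taken from the repo):

```python
import textwrap

# A code block whose lines carry stray leading whitespace, as described above.
bad_block = "    pip install analytics-zoo\n    python my_script.py\n"

# textwrap.dedent strips the whitespace common to all lines, leaving a block
# that can be selected, copied, pasted, and run directly.
fixed = textwrap.dedent(bad_block)
print(fixed)
```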

* enable bigdl 0.12 (intel-analytics#3101)

* switch to bigdl 0.12

* Hyperzoo example ref (intel-analytics#3143)

* specify pip version to fix oserror 0 of proxy (intel-analytics#3165)

* Bigdl0.12.1 (intel-analytics#3155)

* bigdl 0.12.1

* bump 0.10.0-Snapshot (intel-analytics#3237)

* update runtime image name (intel-analytics#3250)

* update jdk download url (intel-analytics#3316)

* update jdk8 url (intel-analytics#3411)

Co-authored-by: ardaci <dongjie.shi@intel.com>

* update hyperzoo docker image (intel-analytics#3429)

* update hyperzoo image (intel-analytics#3457)

* fix jdk in az docker (intel-analytics#3478)

* fix jdk in az docker

* fix jdk for hyperzoo

* fix jdk in jenkins docker

* fix jdk in cluster serving docker

* fix jdk

* fix readme

* update python dep to fit cnvrg (intel-analytics#3486)

* update ray version doc (intel-analytics#3568)

* fix deploy hyperzoo issue (intel-analytics#3574)

Co-authored-by: gaoping <pingx.gao@intel.com>

* add spark fix and net-tools and status check (intel-analytics#3742)

* install netstat and add status check

* add spark fix for graphene

* bigdl 0.12.2 (intel-analytics#3780)

* bump to 0.11-S and fix version issues except ipynb

* add multi-stage build Dockerfile (intel-analytics#3916)

* add multi-stage build Dockerfile

* multi-stage build dockerfile

* multi-stage build dockerfile

* Rename Dockerfile.multi to Dockerfile

* delete Dockerfile.multi

* remove comments, add TINI_VERSION to common arg, remove Dockerfile.multi

* multi-stage add tf_slim

Co-authored-by: shaojie <shaojiex.bai@intel.com>
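Assuming the multi-stage Dockerfile introduced here, a build might be invoked roughly as follows (a sketch: the tag follows the `RUNTIME_K8S_SPARK_IMAGE` convention `${ANALYTICS_ZOO_VERSION}-${SPARK_VERSION}` from the Dockerfile, and `your_jdk_url` must be replaced with a real JDK 8 download URL). The command is echoed rather than executed:

```shell
# Hypothetical build invocation for the multi-stage Dockerfile (sketch).
SPARK_VERSION=2.4.6
ANALYTICS_ZOO_VERSION=0.12.0-SNAPSHOT
TAG="intelanalytics/hyper-zoo:${ANALYTICS_ZOO_VERSION}-${SPARK_VERSION}"
# Print the command instead of running it; drop the echo to actually build.
echo docker build \
  --build-arg SPARK_VERSION="$SPARK_VERSION" \
  --build-arg ANALYTICS_ZOO_VERSION="$ANALYTICS_ZOO_VERSION" \
  --build-arg JDK_URL=your_jdk_url \
  -t "$TAG" .
```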

* update hyperzoo doc and k8s doc (intel-analytics#3959)

* update userguide of k8s

* update k8s guide

* update hyperzoo doc

* Update k8s.md

add note

* Update k8s.md

add note

* Update k8s.md

update notes

* fix 4087 issue (intel-analytics#4097)

Co-authored-by: shaojie <shaojiex.bai@intel.com>

* fixed 4086 and 4083 issues (intel-analytics#4098)

Co-authored-by: shaojie <shaojiex.bai@intel.com>

* Reduce image size (intel-analytics#4132)

* Reduce Dockerfile size
1. del redis stage
2. del flink stage
3. del conda & exclude some python packages
4. add copies layer stage

* update numpy version to 1.18.1

Co-authored-by: zzti-bsj <shaojiex.bai@intel.com>

* update hyperzoo image (intel-analytics#4250)

Co-authored-by: Adria777 <Adria777@github.com>

* bigdl 0.13 (intel-analytics#4210)

* bigdl 0.13

* update

* print exception

* pyspark2.4.6

* update release PyPI script

* update

* flip snapshot-0.12.0 and spark2.4.6 (intel-analytics#4254)

* s-0.12.0 master

* Update __init__.py

* Update python.md

* fix docker issues due to version update (intel-analytics#4280)

* fix docker issues

* fix docker issues

* update Dockerfile to support spark 3.1.2 && 2.4.6 (intel-analytics#4436)

Co-authored-by: shaojie <otnw_bsj@163.com>
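Supporting both Spark versions works by swapping one incompatible jar per version in stage 1 of the Dockerfile (the `RUN if` block): Spark 3.1.2 gets okhttp pinned to 3.8.0, Spark 2.4.6 gets kubernetes-client pinned to 4.4.2. The decision logic, isolated as a sketch:

```shell
# Sketch of the per-version jar fix from stage 1 of the Dockerfile.
jar_fix() {
  case "$1" in
    3.1.2) echo "okhttp-3.8.0.jar" ;;
    2.4.6) echo "kubernetes-client-4.4.2.jar" ;;
    *)     echo "none" ;;
  esac
}
jar_fix 2.4.6
```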

* update hyperzoo, add lib for tf2 (intel-analytics#4614)

* delete tf 1.15.0 (intel-analytics#4719)

Co-authored-by: Le-Zheng <30695225+Le-Zheng@users.noreply.github.com>
Co-authored-by: pinggao18 <44043817+pinggao18@users.noreply.github.com>
Co-authored-by: pinggao187 <44044110+pinggao187@users.noreply.github.com>
Co-authored-by: gaoping <pingx.gao@intel.com>
Co-authored-by: Kai Huang <huangkaivision@gmail.com>
Co-authored-by: GavinGu07 <55721214+GavinGu07@users.noreply.github.com>
Co-authored-by: Yifan Zhu <zhuyifan@stanford.edu>
Co-authored-by: Yifan Zhu <fanzhuyifan@gmail.com>
Co-authored-by: Song Jiaming <litchy233@gmail.com>
Co-authored-by: ardaci <dongjie.shi@intel.com>
Co-authored-by: Yang Wang <yang3.wang@intel.com>
Co-authored-by: zzti-bsj <2779090360@qq.com>
Co-authored-by: shaojie <shaojiex.bai@intel.com>
Co-authored-by: Lingqi Su <33695124+Adria777@users.noreply.github.com>
Co-authored-by: Adria777 <Adria777@github.com>
Co-authored-by: shaojie <otnw_bsj@163.com>
17 people authored Oct 14, 2021
1 parent f52b8e3 commit 77da6ed
Showing 17 changed files with 2,361 additions and 0 deletions.
176 changes: 176 additions & 0 deletions docker/hyperzoo/Dockerfile
@@ -0,0 +1,176 @@
ARG SPARK_VERSION=2.4.6
ARG SPARK_HOME=/opt/spark
ARG JDK_VERSION=8u192
ARG JDK_URL=your_jdk_url
ARG BIGDL_VERSION=0.13.0
ARG ANALYTICS_ZOO_VERSION=0.12.0-SNAPSHOT
ARG TINI_VERSION=v0.18.0

# stage.1 jdk & spark
FROM ubuntu:18.04 as spark
ARG SPARK_VERSION
ARG JDK_VERSION
ARG JDK_URL
ARG SPARK_HOME
ENV TINI_VERSION v0.18.0
ENV SPARK_VERSION ${SPARK_VERSION}
ENV SPARK_HOME ${SPARK_HOME}
RUN apt-get update --fix-missing && \
    apt-get install -y apt-utils vim curl nano wget unzip maven git && \
    # java
    wget $JDK_URL && \
    gunzip jdk-$JDK_VERSION-linux-x64.tar.gz && \
    tar -xf jdk-$JDK_VERSION-linux-x64.tar -C /opt && \
    rm jdk-$JDK_VERSION-linux-x64.tar && \
    mv /opt/jdk* /opt/jdk$JDK_VERSION && \
    ln -s /opt/jdk$JDK_VERSION /opt/jdk && \
    # spark
    wget https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz && \
    tar -zxvf spark-${SPARK_VERSION}-bin-hadoop2.7.tgz && \
    mv spark-${SPARK_VERSION}-bin-hadoop2.7 /opt/spark && \
    rm spark-${SPARK_VERSION}-bin-hadoop2.7.tgz && \
    cp /opt/spark/kubernetes/dockerfiles/spark/entrypoint.sh /opt

RUN ln -fs /bin/bash /bin/sh
RUN if [ $SPARK_VERSION = "3.1.2" ]; then \
        rm $SPARK_HOME/jars/okhttp-*.jar && \
        wget -P $SPARK_HOME/jars https://repo1.maven.org/maven2/com/squareup/okhttp3/okhttp/3.8.0/okhttp-3.8.0.jar; \
    elif [ $SPARK_VERSION = "2.4.6" ]; then \
        rm $SPARK_HOME/jars/kubernetes-client-*.jar && \
        wget -P $SPARK_HOME/jars https://repo1.maven.org/maven2/io/fabric8/kubernetes-client/4.4.2/kubernetes-client-4.4.2.jar; \
    fi

ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /sbin/tini

# stage.2 analytics-zoo
FROM ubuntu:18.04 as analytics-zoo
ARG SPARK_VERSION
ARG BIGDL_VERSION
ARG ANALYTICS_ZOO_VERSION

ENV SPARK_VERSION ${SPARK_VERSION}
ENV BIGDL_VERSION ${BIGDL_VERSION}
ENV ANALYTICS_ZOO_VERSION ${ANALYTICS_ZOO_VERSION}
ENV ANALYTICS_ZOO_HOME /opt/analytics-zoo-${ANALYTICS_ZOO_VERSION}

RUN apt-get update --fix-missing && \
    apt-get install -y apt-utils vim curl nano wget unzip maven git
ADD ./download-analytics-zoo.sh /opt

RUN chmod a+x /opt/download-analytics-zoo.sh && \
    mkdir -p /opt/analytics-zoo-examples/python
RUN /opt/download-analytics-zoo.sh && \
    rm analytics-zoo-bigdl*.zip && \
    unzip $ANALYTICS_ZOO_HOME/lib/*.zip 'zoo/examples/*' -d /opt/analytics-zoo-examples/python && \
    mv /opt/analytics-zoo-examples/python/zoo/examples/* /opt/analytics-zoo-examples/python && \
    rm -rf /opt/analytics-zoo-examples/python/zoo/examples

# stage.3 copies layer
FROM ubuntu:18.04 as copies-layer
ARG ANALYTICS_ZOO_VERSION

COPY --from=analytics-zoo /opt/analytics-zoo-${ANALYTICS_ZOO_VERSION} /opt/analytics-zoo-${ANALYTICS_ZOO_VERSION}
COPY --from=analytics-zoo /opt/analytics-zoo-examples/python /opt/analytics-zoo-examples/python
COPY --from=spark /opt/jdk /opt/jdk
COPY --from=spark /opt/spark /opt/spark
COPY --from=spark /opt/spark/kubernetes/dockerfiles/spark/entrypoint.sh /opt


# stage.4
FROM ubuntu:18.04
MAINTAINER The Analytics-Zoo Authors https://github.com/intel-analytics/analytics-zoo
ARG ANALYTICS_ZOO_VERSION
ARG BIGDL_VERSION
ARG SPARK_VERSION
ARG SPARK_HOME
ARG TINI_VERSION

ENV ANALYTICS_ZOO_VERSION ${ANALYTICS_ZOO_VERSION}
ENV SPARK_HOME ${SPARK_HOME}
ENV SPARK_VERSION ${SPARK_VERSION}
ENV ANALYTICS_ZOO_HOME /opt/analytics-zoo-${ANALYTICS_ZOO_VERSION}
ENV FLINK_HOME /opt/flink-${FLINK_VERSION}
ENV OMP_NUM_THREADS 4
ENV NOTEBOOK_PORT 12345
ENV NOTEBOOK_TOKEN 1234qwer
ENV RUNTIME_SPARK_MASTER local[4]
ENV RUNTIME_K8S_SERVICE_ACCOUNT spark
ENV RUNTIME_K8S_SPARK_IMAGE intelanalytics/hyper-zoo:${ANALYTICS_ZOO_VERSION}-${SPARK_VERSION}
ENV RUNTIME_DRIVER_HOST localhost
ENV RUNTIME_DRIVER_PORT 54321
ENV RUNTIME_EXECUTOR_CORES 4
ENV RUNTIME_EXECUTOR_MEMORY 20g
ENV RUNTIME_EXECUTOR_INSTANCES 1
ENV RUNTIME_TOTAL_EXECUTOR_CORES 4
ENV RUNTIME_DRIVER_CORES 4
ENV RUNTIME_DRIVER_MEMORY 10g
ENV RUNTIME_PERSISTENT_VOLUME_CLAIM myvolumeclaim
ENV SPARK_HOME /opt/spark
ENV HADOOP_CONF_DIR /opt/hadoop-conf
ENV BIGDL_VERSION ${BIGDL_VERSION}
ENV BIGDL_CLASSPATH ${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar
ENV JAVA_HOME /opt/jdk
ENV REDIS_HOME /opt/redis-5.0.5
ENV CS_HOME /opt/work/cluster-serving
ENV PYTHONPATH ${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-python-api.zip:${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-*.zip:${CS_HOME}/serving-python.zip:/opt/models/research/slim
ENV PATH ${ANALYTICS_ZOO_HOME}/bin/cluster-serving:${JAVA_HOME}/bin:/root/miniconda3/bin:${PATH}
ENV TINI_VERSION ${TINI_VERSION}
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8


COPY --from=copies-layer /opt /opt
COPY --from=spark /sbin/tini /sbin/tini
ADD ./start-notebook-spark.sh /opt
ADD ./start-notebook-k8s.sh /opt

RUN mkdir -p /opt/analytics-zoo-examples/python && \
    mkdir -p /opt/analytics-zoo-examples/scala && \
    apt-get update --fix-missing && \
    apt-get install -y apt-utils vim curl nano wget unzip maven git && \
    apt-get install -y gcc g++ make && \
    apt-get install -y libsm6 libxext6 libxrender-dev && \
    rm /bin/sh && \
    ln -sv /bin/bash /bin/sh && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
    # python
    apt-get install -y python3-minimal && \
    apt-get install -y build-essential python3 python3-setuptools python3-dev python3-pip && \
    pip3 install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir --upgrade setuptools && \
    pip install --no-cache-dir numpy==1.18.1 scipy && \
    pip install --no-cache-dir pandas==1.0.3 && \
    pip install --no-cache-dir scikit-learn matplotlib seaborn jupyter jupyterlab requests h5py && \
    ln -s /usr/bin/python3 /usr/bin/python && \
    # Fix tornado await process
    pip uninstall -y -q tornado && \
    pip install --no-cache-dir tornado && \
    python3 -m ipykernel.kernelspec && \
    pip install --no-cache-dir tensorboard && \
    pip install --no-cache-dir jep && \
    pip install --no-cache-dir cloudpickle && \
    pip install --no-cache-dir opencv-python && \
    pip install --no-cache-dir pyyaml && \
    pip install --no-cache-dir redis && \
    pip install --no-cache-dir ray[tune]==1.2.0 && \
    pip install --no-cache-dir Pillow==6.1 && \
    pip install --no-cache-dir psutil aiohttp && \
    pip install --no-cache-dir py4j && \
    pip install --no-cache-dir cmake==3.16.3 && \
    pip install --no-cache-dir torch==1.7.1 torchvision==0.8.2 && \
    pip install --no-cache-dir horovod==0.19.2 && \
    # tf2
    pip install --no-cache-dir pyarrow && \
    pip install opencv-python==4.2.0.34 && \
    pip install aioredis==1.1.0 && \
    pip install tensorflow==2.4.0 && \
    # chmod
    chmod a+x /opt/start-notebook-spark.sh && \
    chmod a+x /opt/start-notebook-k8s.sh && \
    chmod +x /sbin/tini && \
    cp /sbin/tini /usr/bin/tini

WORKDIR /opt/spark/work-dir

ENTRYPOINT [ "/opt/entrypoint.sh" ]
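The `RUNTIME_*` variables defined in the final stage are evidently meant to parameterize spark-submit on Kubernetes at container run time. A hypothetical sketch of how they map onto standard Spark configuration keys (the helper is illustrative, not the project's actual code; defaults mirror the `ENV` values above):

```python
# Defaults mirror the ENV values in the Dockerfile above.
DEFAULTS = {
    "RUNTIME_SPARK_MASTER": "local[4]",
    "RUNTIME_K8S_SPARK_IMAGE": "intelanalytics/hyper-zoo:0.12.0-SNAPSHOT-2.4.6",
    "RUNTIME_K8S_SERVICE_ACCOUNT": "spark",
    "RUNTIME_EXECUTOR_INSTANCES": "1",
    "RUNTIME_EXECUTOR_CORES": "4",
    "RUNTIME_EXECUTOR_MEMORY": "20g",
}

def spark_conf(env):
    """Translate RUNTIME_* variables into standard Spark conf keys (sketch)."""
    merged = {**DEFAULTS, **env}
    return {
        "spark.master": merged["RUNTIME_SPARK_MASTER"],
        "spark.kubernetes.container.image": merged["RUNTIME_K8S_SPARK_IMAGE"],
        "spark.kubernetes.authenticate.driver.serviceAccountName":
            merged["RUNTIME_K8S_SERVICE_ACCOUNT"],
        "spark.executor.instances": merged["RUNTIME_EXECUTOR_INSTANCES"],
        "spark.executor.cores": merged["RUNTIME_EXECUTOR_CORES"],
        "spark.executor.memory": merged["RUNTIME_EXECUTOR_MEMORY"],
    }

# With no overrides, the conf reproduces the Dockerfile defaults.
print(spark_conf({}))
```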
