Deploying a DolphinScheduler Cluster on Kubernetes
I. Download
Prerequisites
- Helm 3.1.0+
- Kubernetes 1.12+
- PV provisioner (must be supported by the underlying infrastructure)
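The minimum versions above can be checked before starting. The following is a minimal sketch, assuming GNU `sort -V` is available; the helper name `version_ge` is made up for this example and is not part of Helm or DolphinScheduler.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; it relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Only query Helm when it is actually installed on this machine.
if command -v helm >/dev/null 2>&1; then
  # `helm version --short` prints something like v3.8.0+gd141386
  helm_ver="$(helm version --short | sed -E 's/^v([0-9.]+).*/\1/')"
  if version_ge "$helm_ver" 3.1.0; then
    echo "Helm $helm_ver meets the 3.1.0+ requirement"
  else
    echo "Helm $helm_ver is older than 3.1.0" >&2
  fi
fi
```

The same helper can be reused for the Kubernetes requirement once the server version has been extracted from `kubectl version`.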
Installing DolphinScheduler
1. Download the installation package
$ wget --no-check-certificate https://dlcdn.apache.org/dolphinscheduler/2.0.5/apache-dolphinscheduler-2.0.5-src.tar.gz
$ tar -zxvf apache-dolphinscheduler-2.0.5-src.tar.gz
$ cd apache-dolphinscheduler-2.0.5-src/docker/kubernetes/dolphinscheduler
After downloading the source, edit the Chart.yaml file in apache-dolphinscheduler-2.0.5-src/docker/kubernetes/dolphinscheduler.
Two entries must be changed: in both places, replace repository: https://charts.bitnami.com/bitnami with repository: https://raw.githubusercontent.com/bitnami/charts/archive-full-index/bitnami
Chart.yaml
[root@k8s-master01 dolphinscheduler]# cat Chart.yaml
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
apiVersion: v2
name: dolphinscheduler
description: Dolphin Scheduler is a distributed and easy-to-expand visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing, making the scheduling system out of the box for data processing.
home: https://dolphinscheduler.apache.org
icon: https://dolphinscheduler.apache.org/img/hlogo_colorful.svg
keywords:
- dolphinscheduler
- scheduler
# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
version: 2.0.3
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application.
appVersion: 2.0.5
dependencies:
- name: postgresql
  version: 10.3.18
  repository: https://raw.githubusercontent.com/bitnami/charts/archive-full-index/bitnami
  condition: postgresql.enabled
- name: zookeeper
  version: 6.5.3
  repository: https://raw.githubusercontent.com/bitnami/charts/archive-full-index/bitnami
  condition: zookeeper.enabled
[root@k8s-master01 dolphinscheduler]#
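The two repository replacements in Chart.yaml can also be scripted instead of edited by hand. This is a minimal sketch; the function name `patch_chart_repo` is hypothetical, and it assumes the old URL appears exactly as written in the original file.

```shell
#!/bin/sh
# patch_chart_repo FILE
# Hypothetical helper: rewrites every occurrence of the old Bitnami chart
# repository URL in FILE to the archived full index URL.
patch_chart_repo() {
  old='https://charts.bitnami.com/bitnami'
  new='https://raw.githubusercontent.com/bitnami/charts/archive-full-index/bitnami'
  # Write to a temporary file and move it back, so the command works the same
  # way with GNU and BSD sed (no reliance on `sed -i`).
  sed "s#${old}#${new}#g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

Run from the chart directory, `patch_chart_repo Chart.yaml` should update both the postgresql and the zookeeper dependency entries in one pass.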
2. Using MySQL as the DolphinScheduler database
How do you replace PostgreSQL with MySQL as the DolphinScheduler database?
- Download the MySQL driver mysql-connector-java-8.0.16.jar
cd /root/softwares/ds/ds-image
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.16/mysql-connector-java-8.0.16.jar
- Create a new Dockerfile that adds the MySQL driver:
Write the Dockerfile; this one also adds a Python environment via Miniconda
FROM dolphinscheduler.docker.scarf.sh/apache/dolphinscheduler:2.0.5
COPY mysql-connector-java-8.0.16.jar /opt/dolphinscheduler/lib
# System packages
RUN sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list
RUN apt-get clean
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get install -y expect && \
    apt-get install -y tar && \
    apt-get install -y vim && \
    apt-get install -y telnet && \
    apt-get install -y net-tools && \
    apt-get install -y iputils-ping
# RUN apt-get update && apt-get install -yq curl wget jq vim
# python env
ARG CONDA_VER=4.12.0
ARG OS_TYPE=x86_64
ARG PY_VER=3.8
ARG PY_VER_CONDA=py38
ARG PANDAS_VER=1.3
# Use the above args
# ARG CONDA_VER
# ARG OS_TYPE
# ARG PY_VER_CONDA
# Install miniconda to /miniconda
# https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
RUN curl -LO "http://repo.continuum.io/miniconda/Miniconda3-${PY_VER_CONDA}_${CONDA_VER}-Linux-${OS_TYPE}.sh"
RUN bash Miniconda3-${PY_VER_CONDA}_${CONDA_VER}-Linux-${OS_TYPE}.sh -p /miniconda -b
RUN rm Miniconda3-${PY_VER_CONDA}_${CONDA_VER}-Linux-${OS_TYPE}.sh
ENV PATH=/miniconda/bin:${PATH}
RUN conda update -y conda
RUN conda init
# ARG PY_VER
# ARG PANDAS_VER
# Install packages from conda
RUN conda install -c anaconda -y python=${PY_VER}
RUN conda install -c anaconda -y \
    pandas=${PANDAS_VER}
- Build a new image that contains the MySQL driver:
docker build -t apache/dolphinscheduler:mysql-driver .
Build output:
[root@quant image]# docker build -t apache/dolphinscheduler:mysql-driver .
...
Downloading and Extracting Packages
openssl-1.1.1q | 3.8 MB | ########## | 100%
python-3.8.13 | 22.7 MB | ########## | 100%
ca-certificates-2022 | 131 KB | ########## | 100%
certifi-2022.6.15 | 156 KB | ########## | 100%
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Retrieving notices: ...working... done
Removing intermediate container 394a97a0668b
---> c45736e37149
Step 20/20 : RUN conda install -c anaconda -y pandas=${PANDAS_VER}
---> Running in a28f9dfe2152
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /miniconda
added / updated specs:
- pandas
The following packages will be downloaded:
package | build
---------------------------|-----------------
blas-1.0 | mkl 6 KB anaconda
bottleneck-1.3.5 | py38h7deecbd_0 125 KB anaconda
intel-openmp-2021.4.0 | h06a4308_3561 8.8 MB anaconda
mkl-2021.4.0 | h06a4308_640 219.1 MB anaconda
mkl-service-2.4.0 | py38h7f8727e_0 62 KB anaconda
mkl_fft-1.3.1 | py38hd3c417c_0 200 KB anaconda
mkl_random-1.2.2 | py38h51133e4_0 341 KB anaconda
numexpr-2.8.3 | py38h807cd23_0 133 KB anaconda
numpy-1.23.1 | py38h6c91a56_0 10 KB anaconda
numpy-base-1.23.1 | py38ha15fc14_0 7.1 MB anaconda
packaging-21.3 | pyhd3eb1b0_0 35 KB anaconda
pandas-1.4.3 | py38h6a678d5_0 12.6 MB anaconda
pyparsing-3.0.4 | pyhd3eb1b0_0 78 KB anaconda
python-dateutil-2.8.2 | pyhd3eb1b0_0 241 KB anaconda
pytz-2022.1 | py38h06a4308_0 243 KB anaconda
six-1.16.0 | pyhd3eb1b0_1 19 KB anaconda
------------------------------------------------------------
Total: 249.1 MB
The following NEW packages will be INSTALLED:
blas anaconda/linux-64::blas-1.0-mkl
bottleneck anaconda/linux-64::bottleneck-1.3.5-py38h7deecbd_0
intel-openmp anaconda/linux-64::intel-openmp-2021.4.0-h06a4308_3561
mkl anaconda/linux-64::mkl-2021.4.0-h06a4308_640
mkl-service anaconda/linux-64::mkl-service-2.4.0-py38h7f8727e_0
mkl_fft anaconda/linux-64::mkl_fft-1.3.1-py38hd3c417c_0
mkl_random anaconda/linux-64::mkl_random-1.2.2-py38h51133e4_0
numexpr anaconda/linux-64::numexpr-2.8.3-py38h807cd23_0
numpy anaconda/linux-64::numpy-1.23.1-py38h6c91a56_0
numpy-base anaconda/linux-64::numpy-base-1.23.1-py38ha15fc14_0
packaging anaconda/noarch::packaging-21.3-pyhd3eb1b0_0
pandas anaconda/linux-64::pandas-1.4.3-py38h6a678d5_0
pyparsing anaconda/noarch::pyparsing-3.0.4-pyhd3eb1b0_0
python-dateutil anaconda/noarch::python-dateutil-2.8.2-pyhd3eb1b0_0
pytz anaconda/linux-64::pytz-2022.1-py38h06a4308_0
six anaconda/noarch::six-1.16.0-pyhd3eb1b0_1
Downloading and Extracting Packages
numpy-1.23.1 | 10 KB | ########## | 100%
pandas-1.4.3 | 12.6 MB | ########## | 100%
mkl_random-1.2.2 | 341 KB | ########## | 100%
intel-openmp-2021.4. | 8.8 MB | ########## | 100%
packaging-21.3 | 35 KB | ########## | 100%
python-dateutil-2.8. | 241 KB | ########## | 100%
pytz-2022.1 | 243 KB | ########## | 100%
pyparsing-3.0.4 | 78 KB | ########## | 100%
numpy-base-1.23.1 | 7.1 MB | ########## | 100%
bottleneck-1.3.5 | 125 KB | ########## | 100%
blas-1.0 | 6 KB | ########## | 100%
mkl-service-2.4.0 | 62 KB | ########## | 100%
numexpr-2.8.3 | 133 KB | ########## | 100%
mkl_fft-1.3.1 | 200 KB | ########## | 100%
six-1.16.0 | 19 KB | ########## | 100%
mkl-2021.4.0 | 219.1 MB | ########## | 100%
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Retrieving notices: ...working... done
Removing intermediate container a28f9dfe2152
---> 5f02810f6511
Successfully built 5f02810f6511
Successfully tagged apache/dolphinscheduler:mysql-driver
[root@quant image]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
apache/dolphinscheduler mysql-driver 5f02810f6511 42 seconds ago 2.5GB
<none> <none> 58b827c9ff7b 17 minutes ago 434MB
- Push the Docker image apache/dolphinscheduler:mysql-driver to a Docker registry
- In values.yaml, change the repository field under image and set tag to mysql-driver
- In values.yaml, set enabled under postgresql to false
- In values.yaml, update the externalDatabase settings (in particular host, username, and password)
externalDatabase:
  type: "mysql"
  driver: "com.mysql.jdbc.Driver"
  host: "localhost"
  port: "3306"
  username: "root"
  password: "root"
  database: "dolphinscheduler"
  params: "useUnicode=true&characterEncoding=UTF-8"
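As an alternative to editing values.yaml in place, the same settings can be collected in a separate override file and passed to Helm with `-f`. The sketch below is only an illustration: the function name `write_mysql_overrides`, the file name, the registry path, and the database host and credentials in the usage note are all placeholders, not values from the chart.

```shell
#!/bin/sh
# write_mysql_overrides FILE IMAGE_REPO DB_HOST DB_USER DB_PASSWORD
# Hypothetical helper: writes a Helm values override file containing the
# image, postgresql, and externalDatabase changes described above.
write_mysql_overrides() {
  cat > "$1" <<EOF
image:
  repository: "$2"
  tag: "mysql-driver"
postgresql:
  enabled: false
externalDatabase:
  type: "mysql"
  driver: "com.mysql.jdbc.Driver"
  host: "$3"
  port: "3306"
  username: "$4"
  password: "$5"
  database: "dolphinscheduler"
  params: "useUnicode=true&characterEncoding=UTF-8"
EOF
}
```

For example, `write_mysql_overrides mysql-values.yaml registry.example.com/apache/dolphinscheduler mysql.example.internal ds_user 'change-me'` writes the file, which can then be applied with `helm install dolphinscheduler . -f mysql-values.yaml`. Keeping the overrides in their own file leaves the chart's values.yaml untouched, which may make later chart upgrades easier to diff.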
8. Changing the Python environment
In values.yaml, change PYTHON_HOME to /usr/bin/python3
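The PYTHON_HOME change can be applied with a one-line substitution. This is a rough sketch under the assumption that values.yaml contains a single line of the form `PYTHON_HOME: "..."`; the function name `set_python_home` is invented for this example.

```shell
#!/bin/sh
# set_python_home FILE NEW_PATH
# Hypothetical helper: replaces the value on the PYTHON_HOME line of FILE with
# NEW_PATH, keeping the line's original indentation.
set_python_home() {
  sed -E "s#^([[:space:]]*PYTHON_HOME:).*#\1 \"$2\"#" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

Run from the chart directory, `set_python_home values.yaml /usr/bin/python3` should rewrite only that line and leave every other setting as it was.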
9. Deploy
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm dependency update .
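$ helm install dolphinscheduler . --set image.tag=2.0.5
Note: values passed with --set take precedence over values.yaml, so to run the custom image built above, pass --set image.tag=mysql-driver instead (or omit the flag so that the tag from values.yaml applies).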
Related articles:
Official site | Quick trial: deploying DolphinScheduler on Kubernetes
Free to share - non-commercial - no derivatives - attribution required (Creative Commons 3.0 license)