
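Kubernetes + RAGFlow Rapid Deployment Guide

Summary: This guide offers a validated "golden path" for building a Kubernetes (v1.30) cluster from scratch on a single Ubuntu 22.04 virtual machine (8C/50G) and then deploying the RAGFlow intelligent retrieval system in one step. It incorporates China-based registry mirrors, kernel parameter tuning, and persistent storage configuration, and is suited to enterprise PoC and R&D testing.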

Stage 1: Kubernetes Cluster Initialization

1. Basic System Configuration

Required privileges: sudo / root

  1. Update the system and install base tools:

    bash
    apt update && apt install -y apt-transport-https ca-certificates curl gnupg lsb-release vim
  2. Disable swap (a hard K8s requirement):

    bash
    swapoff -a
    # Comment out active swap entries; fstab fields may be separated by tabs or spaces
    sed -ri '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab
  3. Tune kernel parameters (enable forwarding of bridged traffic):

    bash
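    # Load br_netfilter first: the net.bridge.* keys do not exist until the module is loaded
    modprobe br_netfilter
    # Load the module again on every boot
    echo br_netfilter | tee /etc/modules-load.d/k8s.conf
    cat <<EOF | tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    sysctl --system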
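
To confirm that the swap and sysctl steps above took effect, a small check along these lines may help. This is only a sketch: `count_active_swap` and `missing_sysctl_keys` are hypothetical helper names, not part of any Kubernetes tooling, and the default paths simply mirror the files edited above.

```shell
# Count fstab lines that still enable swap (i.e. are not commented out).
count_active_swap() {
  local fstab="${1:-/etc/fstab}"
  awk '!/^[[:space:]]*#/ && $3 == "swap" { n++ } END { print n + 0 }' "$fstab"
}

# Print each required sysctl key that is absent from the conf file
# or is not set to 1 there.
missing_sysctl_keys() {
  local conf="${1:-/etc/sysctl.d/k8s.conf}" key
  for key in \
    net.bridge.bridge-nf-call-ip6tables \
    net.bridge.bridge-nf-call-iptables \
    net.ipv4.ip_forward
  do
    awk -F'=' -v k="$key" '
      { gsub(/[[:space:]]/, "", $1); gsub(/[[:space:]]/, "", $2) }
      $1 == k && $2 == "1" { found = 1 }
      END { exit !found }
    ' "$conf" 2>/dev/null || echo "$key"
  done
}
```

Once this section has been applied, `count_active_swap` should print 0 and `missing_sysctl_keys` should print nothing.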

2. Install Containerd (container runtime)

  1. Add the Docker apt repository (via the Aliyun mirror):

    bash
    curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
  2. Install Containerd:

    bash
    apt update && apt install -y containerd.io
  3. Apply the golden configuration (SystemdCgroup + China-based registry mirrors):

    bash
    mkdir -p /etc/containerd
    containerd config default > /etc/containerd/config.toml
    
    # Key changes: enable SystemdCgroup, switch the sandbox image, add the Aliyun registry mirror
    sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
    # Match whichever pause tag this containerd release ships as its default
    sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"|' /etc/containerd/config.toml
    
    # Append the registry mirror configuration
    cat <<EOF >> /etc/containerd/config.toml
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://vgq57xf0.mirror.aliyuncs.com", "https://registry.aliyuncs.com"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]
      endpoint = ["https://registry.aliyuncs.com/google_containers"]
    EOF
    
    systemctl restart containerd
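
Because the `sed` edits above fail silently when their patterns do not match, it can be worth verifying the resulting file. The following is a minimal sketch; `check_containerd_config` is a hypothetical helper, and it only inspects the two settings changed in this section.

```shell
# Verify the two settings this guide changes in containerd's config.
# Prints one line per problem and returns non-zero if any is found.
check_containerd_config() {
  local cfg="${1:-/etc/containerd/config.toml}" rc=0
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$cfg" \
    || { echo "SystemdCgroup is not enabled"; rc=1; }
  grep -Eq '^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"registry\.aliyuncs\.com/' "$cfg" \
    || { echo "sandbox_image does not point at registry.aliyuncs.com"; rc=1; }
  return "$rc"
}

# Usage on the node:
#   check_containerd_config && echo "containerd config looks as expected"
```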

3. Install Kubernetes Components

  1. Add the Kubernetes apt repository (v1.30):

    bash
    curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list
  2. Install the packages and pin their versions:

    bash
    apt update && apt install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl
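
After installation, it may be useful to confirm that the installed tools are all on the 1.30 minor release that the repository above targets. The sketch below assumes hypothetical helper names `minor_of` and `all_on_minor`; the `kubeadm` and `kubelet` invocations in the usage comment are the standard version commands.

```shell
# Print "MAJOR.MINOR" for strings such as "v1.30.2" or "Kubernetes v1.30.2".
minor_of() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1 | cut -d. -f1,2
}

# Succeed only if every given version string is on the expected minor release.
all_on_minor() {
  local expected="$1" v
  shift
  for v in "$@"; do
    if [ "$(minor_of "$v")" != "$expected" ]; then
      echo "unexpected version: $v (wanted $expected.x)" >&2
      return 1
    fi
  done
}

# Usage on the node:
#   all_on_minor 1.30 "$(kubeadm version -o short)" "$(kubelet --version)"
```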

4. Initialize the Cluster

  1. Generate the init configuration (replace controlPlaneEndpoint with your VM's actual IP):

    bash
    cat <<EOF > kubeadm-config.yaml
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    kubernetesVersion: "v1.30.0"
    imageRepository: "registry.aliyuncs.com/google_containers"
    controlPlaneEndpoint: "192.168.161.158:6443"
    networking:
      podSubnet: "10.244.0.0/16"
    ---
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    cgroupDriver: systemd
    EOF
  2. Run the initialization:

    bash
    kubeadm init --config kubeadm-config.yaml
  3. Configure kubectl:

    bash
    mkdir -p $HOME/.kube
    cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    chown $(id -u):$(id -g) $HOME/.kube/config
  4. Install the network plugin (Flannel) and remove the control-plane taint so workloads can be scheduled on this node:

    bash
    kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
    kubectl taint nodes --all node-role.kubernetes.io/control-plane-
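
A node reports NotReady until a CNI plugin is running, so it can take a short while after applying Flannel before the cluster is usable. The sketch below filters `kubectl get nodes` output for nodes that are not yet Ready; `not_ready_nodes` is a hypothetical helper name.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the
# names of nodes whose STATUS column does not start with "Ready".
# "Ready,SchedulingDisabled" still counts as Ready here.
not_ready_nodes() {
  awk '$2 !~ /^Ready(,|$)/ { print $1 }'
}

# Usage on the node:
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node has reached the Ready state.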

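Stage 2: One-Step RAGFlow Deployment

1. Pre-pull Images (faster startup)

Pulling the large images manually beforehand is recommended, so the deployment does not time out.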

bash
crictl pull elasticsearch:8.11.1
crictl pull infiniflow/ragflow:v0.20.5-slim
crictl pull minio/minio:latest
crictl pull mysql:8.0
crictl pull redis:7
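
Large pulls over an unreliable link can fail partway through, so a retry wrapper may be handy. This is a sketch under stated assumptions: `pull_with_retry`, `PULL_CMD`, and `RETRY_DELAY` are hypothetical names introduced here, and the default command is the same `crictl pull` used above.

```shell
# Pull one image, retrying on failure.
# $1 = image reference, $2 = maximum attempts (default 3).
# PULL_CMD overrides the pull command; RETRY_DELAY is the pause in seconds.
pull_with_retry() {
  local image="$1" attempts="${2:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if ${PULL_CMD:-crictl pull} "$image"; then
      return 0
    fi
    echo "pull failed for $image (attempt $i/$attempts)" >&2
    if [ "$i" -lt "$attempts" ]; then
      sleep "${RETRY_DELAY:-5}"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage on the node:
#   for img in elasticsearch:8.11.1 infiniflow/ragflow:v0.20.5-slim mysql:8.0 redis:7; do
#     pull_with_retry "$img" || echo "giving up on $img"
#   done
```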

2. Prepare Persistent Directories

bash
mkdir -p /mnt/ragflow-data/{mysql,minio,es}
chmod -R 777 /mnt/ragflow-data
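
One way these directories could be exposed to the cluster is through hostPath PersistentVolumes. The sketch below only renders such a manifest; the function name `render_hostpath_pv`, the `ragflow-<component>-pv` naming, the `manual` storage class, and the capacity value are all assumptions for illustration and do not come from the RAGFlow manifests.

```shell
# Print a hostPath PersistentVolume manifest for one component.
# $1 = directory name under /mnt/ragflow-data (mysql, minio, or es)
# $2 = capacity, e.g. 20Gi (an assumed value; size it for your data)
render_hostpath_pv() {
  local component="$1" size="$2"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ragflow-${component}-pv
spec:
  capacity:
    storage: ${size}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/ragflow-data/${component}
EOF
}

# Usage on the node:
#   render_hostpath_pv mysql 20Gi | kubectl apply -f -
```

A PersistentVolumeClaim would need the same `storageClassName` to bind to a volume rendered this way.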

3. Deploy the Application

Create ragflow-all-in-one.yaml (the full content is long; refer to the official manifests or to the simplified version below):

yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ragflow
---
# The StatefulSet definitions for MySQL, Redis, MinIO, and ES are omitted here
# The core piece is the RAGFlow Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ragflow
  namespace: ragflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ragflow
  template:
    metadata:
      labels:
        app: ragflow
    spec:
      containers:
      - name: ragflow
        image: infiniflow/ragflow:v0.20.5-slim
        ports:
        - containerPort: 9380
        env:
        - name: MYSQL_HOST
          value: "mysql"
        # ... other environment variables
---
apiVersion: v1
kind: Service
metadata:
  name: ragflow
  namespace: ragflow
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 9380
    nodePort: 30001
  selector:
    app: ragflow

Apply the manifest:

bash
kubectl apply -f ragflow-all-in-one.yaml

4. Verify Access

Wait until all Pods reach the Running state:

bash
kubectl get pods -n ragflow -w

Access URL: http://<VM-IP>:30001
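
Instead of watching the Pod list by hand, the wait can be scripted with a generic polling loop. The sketch below assumes a hypothetical helper named `wait_until`; the `curl` call in the usage comment probes the NodePort URL given above.

```shell
# Run a command repeatedly until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
wait_until() {
  local max_attempts="$1" interval="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (replace <VM-IP> with the address of your virtual machine):
#   wait_until 60 10 curl -fsS -o /dev/null "http://<VM-IP>:30001" \
#     && echo "RAGFlow is answering"
```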

Appendix: Troubleshooting

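Symptom | Possible cause | Resolution
kubeadm init hangs | Slow image pulls | Check that imageRepository points to the Aliyun registry
Pod stuck in ContainerCreating | Pod network not working | Check the CNI plugin status or restart Containerd
RAGFlow cannot connect to the DB | Incorrect environment variables | Check that the Service name matches the ENV values in the YAML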
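
When working through the cases above, a quick list of Pods that are not healthy can narrow things down. The following is a small sketch; `unhealthy_pods` is a hypothetical helper that filters the standard `kubectl get pods -A --no-headers` column layout (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/name: STATUS" for every Pod that is neither Running
# nor Completed.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 ": " $4 }'
}

# Usage on the node:
#   kubectl get pods -A --no-headers | unhealthy_pods
```

For any Pod it reports, `kubectl describe pod <name> -n <namespace>` shows the events that usually explain the status.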

AI-HPC Organization