从零开始使用 IsaacLab 训练机器人行走

IsaacLab 入门教程

1. 环境配置

1.1 推荐配置

操作系统: Ubuntu 22.04 LTS
显卡: NVIDIA RTX 4080 或以上

1.2 Ubuntu 22.04 LTS 安装

参考官方 Ubuntu 安装文档，将镜像文件替换为 Ubuntu 22.04 版本，其他步骤保持不变。建议 /home 与 /usr 的硬盘容量均不少于 200GB。

1.3 安装 NVIDIA 驱动

根据自身显卡型号与操作系统，选择对应的显卡驱动，建议选择 550.xxx 版本的显卡驱动，按照官方教程进行安装即可。安装完成后在终端输入 nvidia-smi，若出现以下信息则表示驱动安装成功：

Thu Jun 5 15:49:51 2025 +-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
+-----------------------------------------------------------------------------------------+

1.4 安装 CUDA, cuDNN

1.4.1 安装 CUDA

根据前面的 NVIDIA 显卡驱动的信息显示，选择对应的 CUDA 版本进行安装。例如上面的信息显示 CUDA Version: 12.4，则可安装 CUDA 12.4 版本及更低版本。可以在 NVIDIA 的 CUDA Toolkit Archive 中，根据本机的操作系统找到对应版本的安装包。

具体安装步骤：

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4

安装后设置环境变量：

echo 'export PATH=/usr/local/cuda-12.4/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda-12.4' >> ~/.bashrc
source ~/.bashrc

验证 CUDA 安装是否成功，可以在终端输入以下命令：

nvcc --version

如果安装成功，会显示 CUDA 的版本信息。

1.4.2 安装 cuDNN

前往 NVIDIA 的 cuDNN Archive 下载对应版本的 cuDNN。选择与 CUDA 版本相匹配的 cuDNN 版本，比如 CUDA 12.4 对应的 cuDNN 版本为 8.9.2。

找到下载的 cudnn 的 deb 文件所在目录，打开终端，执行下述命令：

sudo dpkg -i cudnn-local-repo-ubuntu2204-xxxxxxxxxxxxxx.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-9.10.1/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn-cuda-12

1.5 安装 PyTorch

根据前面下载的 CUDA 版本，在 PyTorch 官网找到对应的安装指令，在自己的 Python 环境中执行以下命令安装 PyTorch（可以等后续创建了 Python 环境后进行安装），以下命令适用于 CUDA 12.4 版本的 PyTorch 安装：

conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia

1.6 安装 Anaconda

参考 Anaconda 官方安装教程。

1.7 安装 Isaac Sim

可以参考 Isaac Sim 官方安装教程，以下是简要步骤：

conda create -n env_isaaclab python=3.10
conda activate env_isaaclab
pip install --upgrade pip
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia
pip install 'isaacsim[all,extscache]==4.5.0' --extra-index-url https://pypi.nvidia.com

首次运行 Isaac Sim 需要较长时间，第一次运行将提示用户接受 NVIDIA 软件许可协议。要接受 EULA，请在出现以下消息提示时回复：

By installing or using Isaac Sim, I agree to the terms of NVIDIA SOFTWARE LICENSE AGREEMENT (EULA) in https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement
Do you accept the EULA? (Yes/No): Yes

1.8 资产配置

在 Isaac Sim 的官方网站下载 4.5 版本的 Isaac Sim 及 Isaac Sim 的资产包，资产包约为 92GB。

按照官方教程进行资产配置。此外，使用 VSCode 进入 Isaac Sim 的库文件目录，在全局搜索中搜索默认的资产目录：

https://omniverse-content-production.s3-us-west-2.amazonaws.com/Assets/Isaac/4.5

全部替换成自己的资产目录：

https://your-asset-directory/Isaac/4.5

1.9 安装 Isaac Lab

可以通过 Git 拉取 Isaac Lab 的代码库，或者从官方 GitHub 页面下载 zip 包，以下是通过 Git 拉取的命令：

git clone [email protected]:isaac-sim/IsaacLab.git

安装后进入 isaaclab 目录，执行以下命令安装相关依赖：

./isaaclab.sh --install

验证安装是否成功，在 isaaclab.sh 所在目录执行：

./isaaclab.sh -p scripts/tutorials/00_sim/create_empty.py

2. 机器人注册

2.1 机器人 USD 文件获取

可以通过 SolidWorks 导出机器人的 URDF 文件，然后打开 Isaac Sim，点击左上角的 File -> Import -> Import URDF，选择导出的 urdf 文件，导入后会生成一个 usd 文件，保存到指定目录。若是移动机器人，记得选择可动基座。若导入时卡住或者报错，检查 URDF 的完整性，检查 STL 路径，若 package://... 的路径格式不可行，建议替换成相对或绝对路径。

2.2 机器人注册

cd ~/isaaclab/source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/config

复制已有的 h1 文件夹，重命名为你的机器人名称，比如这里使用 op3，创建一个 op3.py 作为 op3 机器人的参数文件，当前的目录结构应如下，按需求补充缺失脚本：

isaaclab_tasks
└── manager_based
    └── locomotion
        └── velocity
            ├── config
            │   └── op3
            │       ├── agents
            │       │   ├── __init__.py
            │       │   ├── rsl_rl_ppo_cfg.py
            │       │   ├── skrl_flat_ppo_cfg.yaml
            │       │   ├── skrl_rough_ppo_cfg.yaml
            │       │   └── op3.py
            │       ├── flat_env_cfg.py
            │       ├── rough_env_cfg.py
            │       ├── check_op3.py
            │       ├── velocity_env_cfg.py
            │       └── op3_velocity_env_cfg.py
            └── __init__.py

2.2.1 机器人参数文件编写

机器人参数文件：

机器人参数文件 op3.py 的内容如下：

import isaaclab.sim as sim_utils
from isaaclab.actuators import ImplicitActuatorCfg
from isaaclab.assets.articulation import ArticulationCfg
from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR

OP3_CFG = ArticulationCfg(
    spawn=sim_utils.UsdFileCfg(
        usd_path=f"xxx/op3.usd",  # 替换为你前面导出的机器人 usd 文件路径
        activate_contact_sensors=True,
        rigid_props=sim_utils.RigidBodyPropertiesCfg(
            disable_gravity=False,
            retain_accelerations=False,
            linear_damping=0.0,
            angular_damping=0.0,
            max_linear_velocity=20.0,
            max_angular_velocity=20.0,
            max_depenetration_velocity=10.0,
        ),
        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
            enabled_self_collisions=True,
            solver_position_iteration_count=4,
            solver_velocity_iteration_count=4,
        ),
    ),
    init_state=ArticulationCfg.InitialStateCfg(
        pos=(0.0, 0.0, 0.28),  # 机器人初始位置，具体高度可由脚本 check_op3.py 查看
        joint_pos={
            "head_pan": 0.0,
            "head_tilt": 0.0,
            ".*_hip_yaw": 0.0,
            ".*_hip_roll": 0.0,
            ".*_hip_pitch": 0.0,
            ".*_knee": 0.0,
            ".*_ank_pitch": 0.0,
            ".*_ank_roll": ,
            : ,
            : ,
            : ,
        },
        joint_vel={: },
    ),
    soft_joint_pos_limit_factor=,
    actuators={
        : ImplicitActuatorCfg(
            joint_names_expr=[, , , , , ],
            effort_limit_sim=,
            velocity_limit_sim=,
            stiffness={
                : ,
                : ,
                : ,
                : ,
                : ,
                : ,
            },
            damping={
                : ,
                : ,
                : ,
                : ,
                : ,
                : ,
            },
        ),
        : ImplicitActuatorCfg(
            joint_names_expr=[, , ],
            effort_limit_sim=,
            velocity_limit_sim=,
            stiffness={
                : ,
                : ,
                : ,
            },
            damping={
                : ,
                : ,
                : ,
            },
        ),
        : ImplicitActuatorCfg(
            joint_names_expr=[, ],
            effort_limit_sim=,
            velocity_limit_sim=,
            stiffness={: , : },
            damping={: , : },
        ),
    },
)

机器人基座高度检查脚本：

检查机器人基座高度 check_op3.py 的内容如下，可以先在 op3.py 中给机器人一个较高的初始基座高度，然后在这个脚本的输出中查看实际落地时机器人的基座高度，以调整机器人初始基座高度：

import argparse
import torch
from isaaclab.app import AppLauncher

parser = argparse.ArgumentParser(description="This script demonstrates how to simulate bipedal robots.")
AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

import isaaclab.sim as sim_utils
from isaaclab.assets import Articulation
from isaaclab.sim import SimulationContext

# 改为你的机器人的参数配置脚本：
from op3 import OP3_CFG

def design_scene(sim: sim_utils.SimulationContext) -> tuple[list, torch.Tensor]:
    """Designs the scene."""
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/defaultGroundPlane", cfg)
    cfg = sim_utils.DomeLightCfg(intensity=2000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)
    origins = torch.tensor([[0.0, -1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0],]).to(device=sim.device)
    hr2 = Articulation(OP3_CFG.replace(prim_path="/World/G1"))
    robots = [hr2]
    return robots, origins

def run_simulator(sim: sim_utils.SimulationContext, robots: list[Articulation], origins: torch.Tensor):
    """Runs the simulation loop."""
    sim_dt = sim.get_physics_dt()
    sim_time = 
    count = 
     simulation_app.is_running():
         count %  == :
            sim_time = 
            count = 
         index, robot  (robots):
            joint_pos, joint_vel = robot.data.default_joint_pos, robot.data.default_joint_vel
            robot.write_joint_state_to_sim(joint_pos, joint_vel)
            root_state = robot.data.default_root_state.clone()
            root_state[:, :] += origins[index]
            robot.write_root_pose_to_sim(root_state[:, :])
            robot.write_root_velocity_to_sim(root_state[:, :])
            robot.reset()
            ()
         robot  robots:
            robot.set_joint_position_target(robot.data.default_joint_pos.clone())
            robot.write_data_to_sim()
            sim.step()
            sim_time += sim_dt
            count += 
         robot  robots:
            robot.update(sim_dt)
            root_pos = robot.data.root_pos_w  
            ()  

 ():
    sim_cfg = sim_utils.SimulationCfg(dt=, device=args_cli.device, gravity=[, , -])
    sim = SimulationContext(sim_cfg)
    sim.set_camera_view(eye=[, , ], target=[, , ])
    robots, origins = design_scene(sim)
    sim.reset()
    ()
    run_simulator(sim, robots, origins)

 __name__ == :
    main()
    simulation_app.close()

2.2.2 速度环境配置文件编写

速度环境配置文件 op3_velocity_env_cfg.py 的内容如下，可以先复制原有的 velocity_env_cfg.py，然后对照着修改，需要修改的地方有注释说明：

# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
import math
from dataclasses import MISSING
import isaaclab.sim as sim_utils
from isaaclab.assets import ArticulationCfg, AssetBaseCfg
from isaaclab.envs import ManagerBasedRLEnvCfg
from isaaclab.managers import CurriculumTermCfg as CurrTerm
from isaaclab.managers import EventTermCfg as EventTerm
from isaaclab.managers import ObservationGroupCfg as ObsGroup
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.managers import RewardTermCfg as RewTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.managers import TerminationTermCfg as DoneTerm
from isaaclab.scene import InteractiveSceneCfg
from isaaclab.sensors import ContactSensorCfg, RayCasterCfg, patterns
from isaaclab.terrains import TerrainImporterCfg
from isaaclab.utils import configclass
from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR, ISAACLAB_NUCLEUS_DIR
from isaaclab.utils.noise import AdditiveUniformNoiseCfg  Unoise
 isaaclab_tasks.manager_based.locomotion.velocity.mdp  mdp
 isaaclab.terrains.config.rough  ROUGH_TERRAINS_CFG
 isaaclab.managers  ActionTermCfg


 ():
    
    
    terrain = TerrainImporterCfg(
        prim_path=,
        terrain_type=,
        terrain_generator=ROUGH_TERRAINS_CFG,
        max_init_terrain_level=,
        collision_group=-,
        physics_material=sim_utils.RigidBodyMaterialCfg(
            friction_combine_mode=,
            restitution_combine_mode=,
            static_friction=,
            dynamic_friction=,
        ),
        visual_material=sim_utils.MdlFileCfg(
            mdl_path=,
            project_uvw=,
            texture_scale=(, ),
        ),
        debug_vis=,
    )
    robot: ArticulationCfg = MISSING
    height_scanner = RayCasterCfg(
        prim_path=,  
        offset=RayCasterCfg.OffsetCfg(pos=(, , )),
        attach_yaw_only=,
        pattern_cfg=patterns.GridPatternCfg(resolution=, size=[, ]),
        debug_vis=,
        mesh_prim_paths=[],
    )
    contact_forces = ContactSensorCfg(prim_path=, history_length=, track_air_time=)
    sky_light = AssetBaseCfg(
        prim_path=,
        spawn=sim_utils.DomeLightCfg(
            intensity=,
            texture_file=,
        ),
    )


 :
    
    
    base_velocity = mdp.UniformVelocityCommandCfg(
        asset_name=,  
        resampling_time_range=(, ),  
        rel_standing_envs=,  
        rel_heading_envs=,  
        heading_command=,  
        heading_control_stiffness=,  
        debug_vis=,  
        ranges=mdp.UniformVelocityCommandCfg.Ranges(
            lin_vel_x=(-, ),  
            lin_vel_y=(-, ),  
            ang_vel_z=(-, ),  
            heading=(-math.pi, math.pi),  
        ),
    )


 :
    
    joint_pos = mdp.JointPositionActionCfg(asset_name=, joint_names=[], scale=, use_default_offset=)


 :
    

     ():
        
        base_lin_vel = ObsTerm(func=mdp.base_lin_vel, noise=Unoise(n_min=-, n_max=))
        base_ang_vel = ObsTerm(func=mdp.base_ang_vel, noise=Unoise(n_min=-, n_max=))
        projected_gravity = ObsTerm(func=mdp.projected_gravity, noise=Unoise(n_min=-, n_max=))
        velocity_commands = ObsTerm(func=mdp.generated_commands, params={: })
        joint_pos = ObsTerm(func=mdp.joint_pos_rel, noise=Unoise(n_min=-, n_max=))
        joint_vel = ObsTerm(func=mdp.joint_vel_rel, noise=Unoise(n_min=-, n_max=))
        actions = ObsTerm(func=mdp.last_action)
        height_scan = ObsTerm(
            func=mdp.height_scan,
            params={: SceneEntityCfg()},
            noise=Unoise(n_min=-, n_max=),
            clip=(-, ),
        )

         ():
            .enable_corruption = 
            .concatenate_terms = 

    policy: PolicyCfg = PolicyCfg()


 :
    
    physics_material = EventTerm(
        func=mdp.randomize_rigid_body_material,
        mode=,
        params={
            : SceneEntityCfg(, body_names=),
            : (, ),
            : (, ),
            : (, ),
            : ,
        },
    )
    add_base_mass = EventTerm(
        func=mdp.randomize_rigid_body_mass,
        mode=,
        params={
            : SceneEntityCfg(, body_names=),  
            : (-, ),
            : ,
        },
    )
    base_external_force_torque = EventTerm(
        func=mdp.apply_external_force_torque,
        mode=,
        params={
            : SceneEntityCfg(, body_names=),  
            : (, ),
            : (-, ),
        },
    )
    reset_base = EventTerm(
        func=mdp.reset_root_state_uniform,
        mode=,
        params={
            : {: (-, ), : (-, ), : (-, )},
            : {: (-, ), : (-, ), : (-, ), : (-, ), : (-, ), : (-, )},
        },
    )
    reset_robot_joints = EventTerm(
        func=mdp.reset_joints_by_scale,
        mode=,
        params={: (, ), : (, )},
    )
    push_robot = EventTerm(
        func=mdp.push_by_setting_velocity,
        mode=,
        interval_range_s=(, ),
        params={: {: (-, ), : (-, )}},
    )


 :
    
    track_lin_vel_xy_exp = RewTerm(func=mdp.track_lin_vel_xy_exp, weight=, params={: , : math.sqrt()})
    track_ang_vel_z_exp = RewTerm(func=mdp.track_ang_vel_z_exp, weight=, params={: , : math.sqrt()})
    
    lin_vel_z_l2 = RewTerm(func=mdp.lin_vel_z_l2, weight=-)
    ang_vel_xy_l2 = RewTerm(func=mdp.ang_vel_xy_l2, weight=-)
    dof_torques_l2 = RewTerm(func=mdp.joint_torques_l2, weight=-)
    dof_acc_l2 = RewTerm(func=mdp.joint_acc_l2, weight=-)
    action_rate_l2 = RewTerm(func=mdp.action_rate_l2, weight=-)
    feet_air_time = RewTerm(
        func=mdp.feet_air_time,
        weight=,
        params={: SceneEntityCfg(, body_names=), : , : },
    )
    flat_orientation_l2 = RewTerm(func=mdp.flat_orientation_l2, weight=)
    dof_pos_limits = RewTerm(func=mdp.joint_pos_limits, weight=)


 :
    
    time_out = DoneTerm(func=mdp.time_out, time_out=)
    base_contact = DoneTerm(
        func=mdp.illegal_contact,
        params={: SceneEntityCfg(, body_names=), : },
    )


 :
    
    terrain_levels = CurrTerm(func=mdp.terrain_levels_vel)


 ():
    
    scene: MySceneCfg = MySceneCfg(num_envs=, env_spacing=)
    observations: ObservationsCfg = ObservationsCfg()
    actions: ActionsCfg = ActionsCfg()
    commands: CommandsCfg = CommandsCfg()
    rewards: RewardsCfg = RewardsCfg()
    terminations: TerminationsCfg = TerminationsCfg()
    events: EventCfg = EventCfg()
    curriculum: CurriculumCfg = CurriculumCfg()

     ():
        
        .decimation = 
        .episode_length_s = 
        
        .sim.dt = 
        .sim.render_interval = .decimation
        .sim.physics_material = .scene.terrain.physics_material
        .sim.physx.gpu_max_rigid_patch_count =  *  ** 
         .scene.height_scanner   :
            .scene.height_scanner.update_period = .decimation * .sim.dt
         .scene.contact_forces   :
            .scene.contact_forces.update_period = .sim.dt
         (.curriculum, , )   :
             .scene.terrain.terrain_generator   :
                .scene.terrain.terrain_generator.curriculum = 
            :
                 .scene.terrain.terrain_generator   :
                    .scene.terrain.terrain_generator.curriculum =

2.2.3 环境配置文件编写

复杂地面环境参数配置脚本

复杂地面环境配置文件 rough_env_cfg.py 的内容如下：

# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.managers import RewardTermCfg as RewTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.utils import configclass
import isaaclab_tasks.manager_based.locomotion.velocity.mdp as mdp
from isaaclab_tasks.manager_based.locomotion.velocity.op3_velocity_env_cfg import LocomotionVelocityRoughEnvCfg, RewardsCfg
from op3 import OP3_CFG  # 导入前面编写的机器人参数配置脚本
import random

@configclass
class OP3Rewards(RewardsCfg):
    """Reward terms for the MDP."""
    """机器人死亡时的惩罚"""
    termination_penalty = RewTerm(func=mdp.is_terminated, weight=-200.0)
    lin_vel_z_l2 = None
    """奖励机器人在机器人自身的朝向坐标系下跟踪期望的 xy 线速度"""
    track_lin_vel_xy_exp = RewTerm(
        func=mdp.track_lin_vel_xy_yaw_frame_exp,
        weight=1.0,
        params={"command_name": "base_velocity", "std": 0.5},
    )
    """奖励机器人在世界坐标系下跟踪期望的 z 轴角速度"""
    track_ang_vel_z_exp = RewTerm(
        func=mdp.track_ang_vel_z_world_exp,
        weight=1.0,
        params={"command_name": "base_velocity", "std": 0.5},
    )
    """奖励双足交替抬起 (步态),鼓励有步态的行走，在飞行测试时，令 weight=0 即可"""
    feet_air_time = RewTerm(
        func=mdp.feet_air_time_positive_biped,
        weight=,
        params={: , : SceneEntityCfg(, body_names=), : },
    )
    
    feet_slide = RewTerm(
        func=mdp.feet_slide,
        weight=-,
        params={: SceneEntityCfg(, body_names=), : SceneEntityCfg(, body_names=),},
    )
    
    
    dof_pos_limits = RewTerm(func=mdp.joint_pos_limits, weight=-, params={: SceneEntityCfg(, joint_names=)})
    
    joint_deviation_hip = RewTerm(
        func=mdp.joint_deviation_l1,
        weight=-,
        params={: SceneEntityCfg(, joint_names=[, , ])},
    )
    
    


 ():
    rewards: OP3Rewards = OP3Rewards()

     ():
        ().__post_init__()
        
        .scene.robot = OP3_CFG.replace(prim_path=)
         .scene.height_scanner:
            .scene.height_scanner.prim_path =   
        .events.push_robot = 
        .events.add_base_mass = 
        .events.reset_robot_joints.params[] = (, )
        .events.base_external_force_torque.params[].body_names = []  
        .events.reset_base.params = {
            : {: (-, ), : (-, ), : (-, )},
            : {: (, ), : (, ), : (, ), : (, ), : (, ), : (, )},
        }
        .terminations.base_contact.params[].body_names = []
        .rewards.undesired_contacts = 
        .rewards.flat_orientation_l2.weight = -
        .rewards.dof_torques_l2.weight = 
        .rewards.action_rate_l2.weight = -
        .rewards.dof_acc_l2.weight = -
        .commands.base_velocity.ranges.lin_vel_x = (, )  
        .commands.base_velocity.ranges.lin_vel_y = (, )  
        .commands.base_velocity.ranges.ang_vel_z = (-, )  
        .terminations.base_contact.params[].body_names = 


 ():
     ():
        ().__post_init__()
        .scene.num_envs = 
        .scene.env_spacing = 
        .episode_length_s = 
        .scene.terrain.max_init_terrain_level = 
         .scene.terrain.terrain_generator   :
            .scene.terrain.terrain_generator.num_rows = 
            .scene.terrain.terrain_generator.num_cols = 
            .scene.terrain.terrain_generator.curriculum = 
        
        .commands.base_velocity.ranges.lin_vel_x = (, )
        .commands.base_velocity.ranges.lin_vel_y = (, )
        .commands.base_velocity.ranges.ang_vel_z = (-, )
        .commands.base_velocity.ranges.heading = (, )
        .observations.policy.enable_corruption = 
        .events.base_external_force_torque = 
        .events.push_robot =

平整地面环境参数配置脚本

平地环境配置文件 flat_env_cfg.py 的内容如下：

# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.utils import configclass
from .rough_env_cfg import OP3RoughEnvCfg  # 一样的，将脚本中的'OP3'替换为自己的机器人名称，其他的不用改

@configclass
class OP3FlatEnvCfg(OP3RoughEnvCfg):
    def __post_init__(self):
        super().__post_init__()
        self.scene.terrain.terrain_type = "plane"
        self.scene.terrain.terrain_generator = None
        self.scene.height_scanner = None
        self.observations.policy.height_scan = None
        self.curriculum.terrain_levels = None
        self.rewards.feet_air_time.weight = 1.0
        self.rewards.feet_air_time.params["threshold"] = 0.6

class OP3FlatEnvCfg_PLAY(OP3FlatEnvCfg):
    def __post_init__(self) -> None:
        super().__post_init__()
        self.scene.num_envs = 1
        self.scene.env_spacing = 2.5
        self.observations.policy.enable_corruption = False
        self.events.base_external_force_torque = 
        .events.push_robot =

2.2.4 机器人训练环境注册

在 init.py 中注册机器人训练环境，内容如下，只需要把 OP3 替换为自己的机器人名称即可：

# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
import gymnasium as gym
from . import agents

# Register Gym environments.

""" 地形：崎岖地形 (rough terrain), 有地形生成器和难度课程。
用途：标准训练环境，适合训练机器人在复杂地形上行走。
奖励、终止、观测等：完整，适合正式训练。
"""
gym.register(
    id="Rough-OP3-train",
    entry_point="isaaclab.envs:ManagerBasedRLEnv",
    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": f"{__name__}.rough_env_cfg:OP3RoughEnvCfg",
        "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:OP3RoughPPORunnerCfg",
        "skrl_cfg_entry_point": f"{agents.__name__}:skrl_rough_ppo_cfg.yaml",
    },
)

""" 地形：崎岖地形，但用于'Play'模式。
区别：
环境数量：1，更适合测试和可视化。
地形课程关闭，地形数量减少，内存占用低。
随机扰动/推搡等事件关闭，更稳定。
观测扰动关闭，便于观察真实表现。
用途：用于演示、可视化、调试和模型评估。
"""
gym.register(
    id="Rough-OP3-Play",
    entry_point="isaaclab.envs:ManagerBasedRLEnv",
    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": f"{__name__}.rough_env_cfg:OP3RoughEnvCfg_PLAY",
        "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:OP3RoughPPORunnerCfg",
        "skrl_cfg_entry_point": f"{agents.__name__}:skrl_rough_ppo_cfg.yaml",
    },
)

""" 地形：平地 (flat terrain), 无地形生成器。
区别：
无地形难度课程，地形始终为平面。
无高度扫描观测，观测量减少。
奖励参数适配平地。
用途：适合在平地上训练，便于对比和基础能力训练。
"""
gym.register(
    id=,
    entry_point=,
    disable_env_checker=,
    kwargs={
        : ,
        : ,
        : ,
    },
)


gym.register(
    =,
    entry_point=,
    disable_env_checker=,
    kwargs={
        : ,
        : ,
        : ,
    },
)

2.2.5 强化学习训练参数脚本修改

rsl_rl_ppo_cfg.py:

# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.utils import configclass
from isaaclab_rl.rsl_rl import RslRlOnPolicyRunnerCfg, RslRlPpoActorCriticCfg, RslRlPpoAlgorithmCfg

@configclass
class OP3RoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
    # 将'OP3'替换为自己的机器人名称即可，其他参数可不改
    num_steps_per_env = 24
    max_iterations = 3000
    save_interval = 50
    experiment_name = "OP3_rough"
    empirical_normalization = False
    policy = RslRlPpoActorCriticCfg(
        init_noise_std=1.0,
        actor_hidden_dims=[512, 256, 128],
        critic_hidden_dims=[512, 256, 128],
        activation="elu",
    )
    algorithm = RslRlPpoAlgorithmCfg(
        value_loss_coef=1.0,
        use_clipped_value_loss=True,
        clip_param=0.2,
        entropy_coef=0.01,
        num_learning_epochs=5,
        num_mini_batches=4,
        learning_rate=1.0e-3,
        schedule="adaptive",
        gamma=0.99,
        lam=0.95,
        desired_kl=0.01,
        max_grad_norm=1.0,
    )

@configclass
class OP3FlatPPORunnerCfg(OP3RoughPPORunnerCfg):
    def ():
        ().__post_init__()
        .max_iterations = 
        .experiment_name = 
        .policy.actor_hidden_dims = [, , ]
        .policy.critic_hidden_dims = [, , ]

skrl_flat_ppo_cfg.yaml 和 skrl_rough_ppo_cfg.yaml: skrl_flat_ppo_cfg.yaml 和 skrl_rough_ppo_cfg.yaml 的内容中，只需要把 OP3 替换为自己的机器人名称即可，其他参数可不改。

skrl_flat_ppo_cfg.yaml 的内容如下：

seed: 42
models:
  separate: False
policy:
  class: GaussianMixin
  clip_actions: False
  clip_log_std: True
  min_log_std: -20.0
  max_log_std: 2.0
  initial_log_std: 0.0
  network:
    - name: net
      input: STATES
      layers: [128, 128, 128]
      activations: elu
      output: ACTIONS
value:
  class: DeterministicMixin
  clip_actions: False
  network:
    - name: net
      input: STATES
      layers: [128, 128, 128]
      activations: elu
      output: ONE
memory:
  class: RandomMemory
  memory_size: -1
agent:

skrl_rough_ppo_cfg.yaml 的内容如下：

seed: 42
models:
  separate: False
policy:
  class: GaussianMixin
  clip_actions: False
  clip_log_std: True
  min_log_std: -20.0
  max_log_std: 2.0
  initial_log_std: 0.0
  network:
    - name: net
      input: STATES
      layers: [512, 256, 128]
      activations: elu
      output: ACTIONS
value:
  class: DeterministicMixin
  clip_actions: False
  network:
    - name: net
      input: STATES
      layers: [512, 256, 128]
      activations: elu
      output: ONE
memory:
  class: RandomMemory
  memory_size: -1
agent:

3. 机器人训练与测试

3.1 机器人训练

在完成前面的注册机器人训练环境的所有操作后，进入 isaaclab 目录，在终端执行训练语句：

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Flat-OP3-train --max_iterations 3000 --seed 42 --headless

这里的训练框架为 rsl-rl，你也可以使用 isaaclab/scripts/reinforcement_learning/... 路径下的其他训练框架，显然这里的训练环境就是前面注册的 Flat-OP3-train，你也可以使用复杂地面环境 Rough-OP3-train。

此外，关于训练时填入的参数说明可以通过下面语句获取：

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Flat-OP3-train -h

3.2 机器人测试

在完成训练后，你会发现在 ~/isaaclab/logs/rsl_rl/OP3_flat 路径下有很多 .pt 文件，这些文件就是训练过程中保存的强化学习网络模型参数文件，你可以使用这些参数文件来测试机器人的行走能力。

比如，我前面训练了 3000 轮，那我就选取最后一次获取的 .pt 文件 ~/model_2999.pt，在终端执行下面的测试语句：

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Flat-OP3-Play --checkpoint /home/suiyi/isaacsim/isaaclab/logs/rsl_rl/OP3_flat/2025-06-05_19-31-40/model_2999.pt

从零开始使用 IsaacLab 训练机器人行走

IsaacLab 入门教程

1. 环境配置

1.1 推荐配置

1.2 Ubuntu 22.04 LTS 安装

1.3 安装 NVIDIA 驱动

1.4 安装 CUDA, cuDNN

1.4.1 安装 CUDA

1.4.2 安装 cuDNN

1.5 安装 PyTorch

1.6 安装 Anaconda

1.7 安装 Isaac Sim

1.8 资产配置

1.9 安装 Isaac Lab

2. 机器人注册

2.1 机器人 USD 文件获取

2.2 机器人注册

2.2.1 机器人参数文件编写

2.2.2 速度环境配置文件编写

2.2.3 环境配置文件编写

2.2.4 机器人训练环境注册

2.2.5 强化学习训练参数脚本修改

3. 机器人训练与测试

3.1 机器人训练

3.2 机器人测试

更多推荐文章

相关免费在线工具

从零开始使用 IsaacLab 训练机器人行走

IsaacLab 入门教程

1. 环境配置

1.1 推荐配置

1.2 Ubuntu 22.04 LTS 安装

1.3 安装 NVIDIA 驱动

1.4 安装 CUDA, cuDNN

1.4.1 安装 CUDA

1.4.2 安装 cuDNN

1.5 安装 PyTorch

1.6 安装 Anaconda

1.7 安装 Isaac Sim

1.8 资产配置

1.9 安装 Isaac Lab

2. 机器人注册

2.1 机器人 USD 文件获取

2.2 机器人注册

2.2.1 机器人参数文件编写

2.2.2 速度环境配置文件编写

2.2.3 环境配置文件编写

2.2.4 机器人训练环境注册

2.2.5 强化学习训练参数脚本修改

3. 机器人训练与测试

3.1 机器人训练

3.2 机器人测试

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具