Check GPU memory usage per process:
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
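To pick free GPUs from a script, the CSV output of the command above can be parsed in Python. This is only a rough sketch: the function name parse_compute_apps is made up for illustration, the sample rows are invented, and the header and "MiB" suffix are assumed to look the way nvidia-smi usually prints them.

```python
import csv
import io


def parse_compute_apps(csv_text):
    """Parse the CSV text printed by the nvidia-smi query above (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [row for row in reader if row]  # drop blank lines
    apps = []
    for pid, name, mem in rows[1:]:  # rows[0] is the header line
        value = mem.split()[0]  # "10240 MiB" -> "10240"
        apps.append({
            "pid": int(pid),
            "process_name": name,
            # Assumption: non-numeric values such as "[N/A]" are mapped to None.
            "used_mib": int(value) if value.isdigit() else None,
        })
    return apps


# Invented sample output, for illustration only.
sample = """pid, process_name, used_gpu_memory [MiB]
1234, python, 10240 MiB
5678, python, 2048 MiB
"""

for app in parse_compute_apps(sample):
    print(app)
```

For live data, the text could come from subprocess.run(..., capture_output=True, text=True).stdout with the same nvidia-smi arguments.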
Distributed training on selected GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=12098 --nproc_per_node=4 train.py --config config/config_cpgnet_sgd_bili_sample_ohem_fp16.py
Evaluation:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=12098 --nproc_per_node=4 evaluate.py --config config/config_cpgnet_sgd_bili_sample_ohem_fp16.py --start_epoch 0 --end_epoch 47
Change the ranges in the config file to match the actual extent of the data. Original values:
class Voxel:
    RV_theta = (-40.0, 20.0)
    range_x = (-50.0, 50.0)
    range_y = (-50.0, 50.0)
    range_z = (-3.0, 5.0)
    bev_shape = (600, 600, 30)
    rv_shape = (64, 2048)
Change to:
class Voxel:
    RV_theta = (-10.0, 60.0)
    range_x = (-20.0, 50.0)
    range_y = (-20.0, 20.0)
    range_z = (-2.0, 3.0)
    bev_shape = (600, 600, 30)
    rv_shape = (64, 2048)
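Since bev_shape stays at (600, 600, 30) while the ranges shrink, the size of each BEV cell changes. A quick check, assuming the three entries of bev_shape are the bin counts along x, y and z, and that the ranges share one length unit (probably meters); the helper cell_size is made up for illustration:

```python
def cell_size(value_range, num_bins):
    """Length covered by one bin when value_range is split into num_bins equal bins."""
    lo, hi = value_range
    return (hi - lo) / num_bins


class Voxel:
    # Values copied from the modified config above.
    range_x = (-20.0, 50.0)
    range_y = (-20.0, 20.0)
    range_z = (-2.0, 3.0)
    bev_shape = (600, 600, 30)


# Assumption: bev_shape is ordered (x bins, y bins, z bins).
ranges = (Voxel.range_x, Voxel.range_y, Voxel.range_z)
sizes = [cell_size(r, n) for r, n in zip(ranges, Voxel.bev_shape)]

for axis, size in zip("xyz", sizes):
    print(f"{axis}: {size:.4f} per cell")
```

With these numbers the x and y cells are no longer the same size (70/600 versus 40/600), so it may be worth adjusting bev_shape too if square cells are wanted.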