A Multi-Function Radar Task Scheduling Algorithm Based on Proximal Policy Optimization with Opportunity Window

赵子贺; 冉华明; 吴元

Southwest Institute of Electronic Technology (西南电子技术研究所)

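Abstract:

In the complex electromagnetic environment of the modern battlefield, a multifunctional radar must handle heterogeneous tasks such as tracking and searching concurrently. Because traditional scheduling algorithms lack an elastic time design, they commonly fail to balance the timeliness of high-priority tasks against the coverage of low-priority tasks. Time-slice algorithms use fixed, inelastic slots, which easily leads to wasted slots or blocked tasks; time-window algorithms use rigid windows, which makes it hard to reserve idle time and yields a poor scheduling rate for low-priority tasks. To address this, this paper proposes a Proximal Policy Optimization (PPO) opportunity-window scheduling algorithm centered on optimizing elastic time. First, an elastically adjustable task model is constructed that relaxes the traditional fixed-time constraint and allows a task's actual start time to adapt to the resource state. Next, an elastic-time-driven heuristic strategy is designed that takes maximizing slack time as its objective, guaranteeing completion of high-priority tasks while creating insertion opportunities for low-priority tasks. Finally, PPO is used to optimize the size of the opportunity window: task priority, the number of tasks in the scheduling cycle, and the distribution of task durations are included in the state space, and a single-step reward based on the task scheduling success rate within one scheduling cycle is coupled with a terminal reward based on the task completion rate over the full period, so that elastic time and window size are optimized jointly. Through its core elastic-time design, the algorithm effectively addresses the rigid resource allocation of traditional scheduling and provides technical support for priority balancing and efficient resource utilization in multifunctional radars under oversaturated task loads.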
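The interplay between elastic start times and the opportunity window can be illustrated with a small sketch. This is not the paper's algorithm: the `Task` fields, the greedy selection rule (highest priority first, ties broken by least remaining slack), and the reading of the opportunity window as a look-ahead span `window` are all assumptions made for illustration. The PPO component is not shown; `window` stands in for the value a learned policy would choose.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # assumed convention: larger value = more important
    earliest: float   # earliest permitted start time
    latest: float     # latest permitted start time (earliest <= latest)
    duration: float   # dwell time once the task starts


def schedule(tasks, horizon, window):
    """Greedy single-timeline scheduler (hypothetical, for illustration only).

    At time t, tasks whose earliest start is no later than t + window are
    candidates. The highest-priority candidate is chosen, ties are broken by
    least remaining slack, and the task starts at max(t, earliest).
    Assumes window >= 0. Returns (timeline, dropped).
    """
    t = 0.0
    pending = list(tasks)
    timeline, dropped = [], []
    while pending:
        # Tasks whose latest start time has passed can no longer be executed.
        dropped.extend(k for k in pending if k.latest < t)
        pending = [k for k in pending if k.latest >= t]
        if not pending:
            break
        candidates = [k for k in pending if k.earliest <= t + window]
        if not candidates:
            # Nothing within reach: idle until the next task becomes available.
            t = min(k.earliest for k in pending)
            continue
        best = min(candidates,
                   key=lambda k: (-k.priority, k.latest - max(t, k.earliest)))
        pending.remove(best)
        start = max(t, best.earliest)
        if start + best.duration > horizon:
            dropped.append(best)  # would overrun the scheduling cycle
            continue
        timeline.append((best.name, start, start + best.duration))
        t = start + best.duration
    return timeline, dropped


def scheduling_rate(timeline, dropped):
    """Fraction of tasks that were scheduled (a stand-in for a step reward)."""
    total = len(timeline) + len(dropped)
    return len(timeline) / total if total else 0.0


if __name__ == "__main__":
    demo = [
        Task("track", 3, 2.0, 3.0, 2.0),
        Task("search_a", 1, 0.0, 4.0, 2.0),
        Task("search_b", 1, 0.0, 1.0, 1.0),
    ]
    for w in (0.0, 2.0):
        tl, dr = schedule(demo, horizon=10.0, window=w)
        print(w, tl, [k.name for k in dr], scheduling_rate(tl, dr))
```

In this toy case, `window=0.0` schedules all three tasks but starts the tracking task one time unit after its earliest start, while `window=2.0` starts the tracking task on time and drops one search task. Under these assumptions, the window size trades high-priority timeliness against low-priority coverage, which is the kind of parameter the abstract describes tuning with PPO.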

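Key words: multifunctional radar; scheduling algorithm; Proximal Policy Optimization (PPO); opportunity window; task scheduling rate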

DOI：

Classification number: TP181; TN95

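Funding: Multi-Source Video Fusion Perception Technology Based on Brain-Inspired Intelligence, National Natural Science Foundation of China (General Program, Key Program, Major Program)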
