• Bilevel Multi‑Armed Bandit framework for underwater acoustic communications in dynamic environments.
• Inner CD‑MAB jointly optimizes modulation and power using channel state and Age of Information.
• Outer Feedback Scheduling MAB adjusts feedback interval based on throughput stability and drops.
• Adaptive mechanism reduces feedback overhead while remaining responsive to channel dynamics.
• Computationally efficient design fits resource‑constrained UWA nodes and low‑power devices.
• Simulations show up to 20.6% throughput gain and 36.6% energy savings versus DRL baselines.
• Results validated using DESERT Underwater Network Simulator, confirming practical gains.

Article Summaries:

  • Andrea Panebianco and colleagues propose a two‑level multi‑armed bandit (MAB) framework to improve underwater acoustic (UWA) communications, where bandwidth is scarce and propagation delays are long. The inner level, a Contextual Delayed MAB (CD‑MAB), jointly selects the modulation scheme and transmission power, using both channel‑state feedback and the Age of Information (AoI) of that feedback to maximize throughput. The outer level, a Feedback Scheduling MAB, adjusts the interval between channel‑state updates: stable throughput permits longer intervals, while throughput drops trigger more frequent feedback, which reduces feedback overhead. Simulations with the DESERT Underwater Network Simulator show up to 20.6% throughput gain and 36.6% energy savings versus existing deep‑reinforcement‑learning (DRL) baselines.
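
To make the two‑level structure summarized above more concrete, here is a minimal Python sketch. It is not the authors' algorithm, and several things in it are assumptions made only for illustration: plain UCB1 stands in for both bandit policies, and the two‑state channel, the success probabilities, the candidate intervals, the `feedback_cost` penalty and the helper names (`UCB1`, `aoi_bucket`, `simulate`) are all hypothetical. The inner bandits pick a (bits per symbol, power) pair per context, where the context is the last reported channel state plus a bucketed AoI. The outer bandit picks how many slots to wait before the next channel report.

```python
import math
import random
from collections import defaultdict


class UCB1:
    """Plain UCB1 over a fixed list of arms (a stand-in policy, assumed here)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)

    def select(self):
        # Play every arm once before relying on the confidence bound.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)
        scores = [m + math.sqrt(2.0 * math.log(total) / n)
                  for m, n in zip(self.means, self.counts)]
        return max(range(len(self.arms)), key=scores.__getitem__)

    def update(self, i, reward):
        # Incremental running mean of the observed rewards for arm i.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]


def aoi_bucket(aoi, n_buckets=3):
    """Clip the age of the last channel report (in slots) to a bucket index."""
    return min(aoi, n_buckets - 1)


def simulate(slots=2000, seed=0):
    rng = random.Random(seed)
    # Inner arms: (bits per symbol, transmit power), hypothetical values.
    inner_arms = [(bits, power) for bits in (1, 2) for power in (1.0, 2.0)]
    intervals = [1, 4, 16]   # candidate feedback intervals in slots (assumed)
    feedback_cost = 0.3      # toy throughput penalty per feedback message
    inner = defaultdict(lambda: UCB1(inner_arms))  # one bandit per context
    outer = UCB1(intervals)

    channel = 1              # toy two-state channel: 0 = bad, 1 = good
    t, total = 0, 0.0
    while t < slots:
        k = outer.select()
        interval = intervals[k]
        reported = channel   # a fresh channel report opens each block
        block_reward, played = 0.0, 0
        for aoi in range(interval):
            if t >= slots:
                break
            ctx = (reported, aoi_bucket(aoi))
            a = inner[ctx].select()
            bits, power = inner_arms[a]
            # Toy success model: more power helps, denser modulation hurts.
            p_ok = min(1.0, (0.9 if channel else 0.4) * power / bits)
            reward = (bits if rng.random() < p_ok else 0) / 2.0  # in [0, 1]
            inner[ctx].update(a, reward)
            block_reward += reward
            played += 1
            total += reward
            if rng.random() < 0.05:   # the channel occasionally flips state
                channel = 1 - channel
            t += 1
        # Outer reward: mean block throughput minus amortized feedback cost.
        outer.update(k, block_reward / played - feedback_cost / interval)
    return total / slots, outer.counts
```

Calling `avg, pulls = simulate()` returns the average normalized throughput and how often each feedback interval was chosen. In this toy setup the outer reward trades throughput against an amortized feedback cost, which is a simplification of the stability‑and‑drop criterion described in the summary.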

Sources: