PREDICTIVE THERMAL MANAGEMENT IN EMBEDDED ELECTRONICS USING DEEP REINFORCEMENT LEARNING
Abstract
Background. This paper presents a deep reinforcement learning approach for intelligent thermal management in embedded electronics, targeting energy-efficient and safe operation under dynamic workloads. A custom hardware switching circuit based on an NPN transistor was designed to enable GPIO-driven fan actuation on a resource-constrained platform.
Materials and Methods. A real-time dataset was collected from a Raspberry Pi Zero W, capturing CPU temperature, usage metrics, and fan states over a 12-hour controlled experiment. The thermal regulation task was modeled as a Markov Decision Process, and a Deep Q-Network (DQN) was trained to learn optimal fan activation policies. The trained model was deployed directly on-device, interfaced with a custom GPIO-controlled fan circuit. Inference was performed in less than one millisecond per decision step using a lightweight PyTorch runtime.
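The MDP formulation above can be illustrated with a self-contained toy. The paper trains a DQN; the sketch below substitutes tabular Q-learning on a deterministic one-bin-per-step thermal model so it runs anywhere. The state discretization (10 temperature bins, bin 5 at roughly the 60°C threshold), reward weights, and hyperparameters are all assumptions, since the abstract does not report them.

```python
import random

# Toy analogue of the thermal-control MDP (all constants are assumptions).
# State: CPU temperature discretized into 10 bins; actions: 0 = fan off, 1 = on.
N_BINS, ACTIONS = 10, (0, 1)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.3


def step(bin_, action):
    """Deterministic toy dynamics: the fan cools one bin, idling heats one bin."""
    nxt = max(0, bin_ - 1) if action else min(N_BINS - 1, bin_ + 1)
    # Reward penalizes overheating (bins at or above the ~60 C mark) and fan energy.
    reward = (-5.0 if nxt >= 5 else 0.0) - (1.0 if action else 0.0)
    return nxt, reward


def train(episodes=3000, horizon=40, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        s = rng.randrange(N_BINS)  # random restart for exploration coverage
        for _ in range(horizon):
            # Epsilon-greedy action selection, as in standard Q-learning.
            a = rng.choice(ACTIONS) if rng.random() < EPSILON else int(q[s][1] > q[s][0])
            s2, r = step(s, a)
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return q


q = train()
# The learned policy is hysteresis-like: fan on near overheating, off when cool.
assert q[9][1] > q[9][0] and q[0][0] > q[0][1]
```

In the paper the tabular lookup is replaced by a Q-network over continuous temperature and usage features, with the same action space and the same energy-versus-overheating trade-off in the reward.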
Results and Discussion. Evaluation results show that the DQN policy reduced total fan activation time by 23.2% compared to the rule-based hysteresis baseline, while maintaining CPU temperature below 60°C for over 99% of the test duration. The trained agent activated the fan only 23.7% of the time, demonstrating a conservative and energy-aware cooling strategy. Confusion matrix analysis yielded a precision, recall, and F1-score of 1.000 each across 3442 model-controlled evaluation steps: the model correctly identified all 22 fan activation events with no false positives or false negatives. Comparative analysis against nine recent AI-driven approaches showed that the proposed method achieved an 11.2°C temperature reduction and 36.5% energy savings, while operating entirely on-device without cloud dependence.
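For reference, the rule-based hysteresis baseline against which the DQN is compared can be sketched in a few lines. The 60°C turn-on threshold matches the abstract's safety limit; the 55°C turn-off threshold, the function name, and the sample trace are assumptions for illustration.

```python
def hysteresis_fan(temps, on_at=60.0, off_at=55.0):
    """Rule-based baseline: fan switches on above `on_at`, off below `off_at`.

    The dead band between the two thresholds prevents rapid on/off chatter.
    """
    fan, states = False, []
    for t in temps:
        if not fan and t >= on_at:
            fan = True
        elif fan and t <= off_at:
            fan = False
        states.append(fan)
    return states


# Hypothetical temperature trace in degrees Celsius.
trace = [50, 58, 61, 59, 56, 54, 52, 62, 57, 53]
print(hysteresis_fan(trace))
# -> [False, False, True, True, True, False, False, True, True, False]
```

Such a controller reacts only after the threshold is crossed; the reported gain of the DQN policy comes from anticipating load-driven temperature rises and so keeping total fan-on time lower (23.7% of steps) at the same safety level.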
Conclusion. The model exhibited stable reward convergence, accurate action prediction, and anticipatory control that minimized overheating events. Thermal traces confirmed smooth transitions and low variance, demonstrating the feasibility of deploying learning-based thermal policies in real-time edge environments. This work contributes a practical framework for energy-aware cooling and provides a pathway for adaptive thermal intelligence in low-resource embedded systems.
Keywords: thermal management, deep reinforcement learning, embedded systems, Deep Q-Network, energy-efficient cooling, real-time inference.
DOI: http://dx.doi.org/10.30970/eli.33.11
Electronics and information technologies / Електроніка та інформаційні технології