Proceedings of International Conference on Applied Innovation in IT
2025/08/29, Volume 13, Issue 4, pp. 185-192
Review of Advanced Data Structures for Streaming Data Handling in Machine Learning Under Concept Drift
Mithradevi Kumar, Aswathi KP, Janani Venugopal and Farah Ahmed Abdulrahman
Abstract: Machine learning on real-time data streams poses a significant challenge, notably when concept drift arises: a shift in which the makeup of the data or the relationship between input and output changes over time. Advanced data structures and adaptive algorithms are essential for managing the volume, speed, and variability inherent in streaming inputs, and for adapting effectively to these shifts. This review examines novel and enhanced data structures designed for stream handling in machine learning systems, with a key focus on Forgetful Trees and EdgeFrame. Forgetful Trees are tree-based models designed to handle concept drift by probabilistically discarding older entries and adjusting values as output rates change. This design targets rapid updates and strong output. EdgeFrame is a low-memory data container tailored for industry-grade machine learning tasks on changing, matrix-heavy datasets. It offers features such as light replicas, write-aware copies, and memory-optimized storage to improve usage and reduce resource needs. Taken together, Forgetful Trees mainly address model adaptation under drift, while EdgeFrame targets more efficient underlying data handling. There is room for a hybrid approach that combines the benefits of both to build more stable and more accurate systems for streaming data. Other useful data structures discussed include RSBF, Hash Trees, RLR-Tree, and BART.
Keywords: Concept Drift, Streaming Data, Machine Learning, Forgetful Forests, EdgeFrame, Data Structures.
DOI: 10.25673/122112