

Proceedings of International Conference on Applied Innovation in IT
2025/08/29, Volume 13, Issue 4, pp.185-192

Review of Advanced Data Structures for Streaming Data Handling in Machine Learning Under Concept Drift


Mithradevi Kumar, Aswathi KP, Janani Venugopal and Farah Ahmed Abdulrahman


Abstract: Training machine learning models on real-time data streams is challenging, particularly under concept drift, where the data distribution or the relationship between inputs and outputs changes over time. Handling the volume, velocity, and variability of streaming inputs, and adapting effectively to such drift, requires specialized data structures and adaptive algorithms. This review examines new and enhanced data structures for stream handling in machine learning, with a focus on Forgetful Trees and EdgeFrame. Forgetful Trees are tree-based models that handle drift by probabilistically discarding older entries and adjusting their parameters as performance changes; the design targets fast updates and strong predictive performance. EdgeFrame is a memory-efficient data container tailored to production machine learning workloads with changing, matrix-heavy datasets. It offers features such as lightweight copies, write-aware copies, and memory-optimized storage to improve utilization and reduce resource needs. The comparison shows that Forgetful Trees mainly address model adaptation under drift, while EdgeFrame targets more efficient underlying data handling. There is room for a hybrid approach that combines the strengths of both to build more stable and more accurate systems for streaming data. Other useful data structures discussed include RSBF, Hash Trees, RLR-Tree, and BART.

Keywords: Concept Drift, Streaming Data, Machine Learning, Forgetful Forests, EdgeFrame, Data Structures.
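
As a rough illustration of the forgetting idea summarized in the abstract, the following minimal Python sketch keeps a buffer of stream samples and discards older ones with a probability tied to the drop in accuracy. It is not the algorithm from the reviewed work: the class name `ForgetfulBuffer`, the parameters `max_size` and `protect_latest`, and the retention rule (new accuracy divided by previous accuracy) are all assumptions made for this example.

```python
import random
from collections import deque


class ForgetfulBuffer:
    """Hypothetical sample buffer that forgets old data when accuracy falls."""

    def __init__(self, max_size=1000, protect_latest=10, seed=0):
        self.samples = deque(maxlen=max_size)  # oldest samples fall off first
        self.protect_latest = protect_latest   # newest samples are never dropped
        self.prev_accuracy = None
        self.rng = random.Random(seed)         # seeded for reproducibility

    def add(self, x, y):
        self.samples.append((x, y))

    def update(self, accuracy):
        """Report the latest accuracy; return how many samples were forgotten."""
        retain = 1.0
        if self.prev_accuracy and accuracy < self.prev_accuracy:
            # Assumed rule: keep old samples in proportion to the accuracy ratio.
            retain = accuracy / self.prev_accuracy
        self.prev_accuracy = accuracy

        items = list(self.samples)
        cut = max(0, len(items) - self.protect_latest)
        old, recent = items[:cut], items[cut:]
        kept = [s for s in old if self.rng.random() < retain]
        self.samples = deque(kept + recent, maxlen=self.samples.maxlen)
        return len(old) - len(kept)


buf = ForgetfulBuffer(max_size=100, protect_latest=5)
for i in range(50):
    buf.add(i, i % 2)
buf.update(0.90)  # first report: nothing to compare against, nothing dropped
buf.update(0.45)  # accuracy halved: each older sample is kept with probability 0.5
```

A model retrained on `buf.samples` after each `update` call would then see mostly recent data once accuracy starts to fall, which is one simple way to trade memory of the old concept for faster adaptation.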

DOI: 10.25673/122112


References:

  1. J. Gama, I. Žliobaitė, A. Bifet, M. Pechenizkiy, and A. Bouchachia, “A survey on concept drift adaptation,” ACM Comput. Surv. (CSUR), vol. 46, no. 4, Art. no. 44, 2014, doi: 10.1145/2523813.
  2. P. Domingos and G. Hulten, “Mining high-speed data streams,” in Proc. 6th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, 2000, pp. 71–80, doi: 10.1145/347090.347107.
  3. H. M. Gomes et al., “Adaptive random forests for evolving data stream classification,” Mach. Learn., vol. 106, no. 9–10, pp. 1469–1495, 2017, doi: 10.1007/s10994-017-5642-8.
  4. A. Bifet and R. Gavaldà, “Learning from time-changing data with adaptive windowing,” in Proc. SIAM Int. Conf. Data Mining, 2007.
  5. M. Kumar, K. P. Aswathi, and V. Janani, “Forgetful forests: Adaptive tree-based learning for streaming data under concept drift,” Int. J. Streaming Data Anal., vol. 9, no. 2, pp. 101–118, 2024.
  6. M. Zaharia et al., “Apache Arrow: A cross-language development platform for in-memory data,” Proc. VLDB Endowment, vol. 10, no. 12, pp. 1986–1989, 2016.
  7. Y. Zhang and D. Wang, “CAMAL: A latency-sensitive LSM-tree structure for streaming data,” Int. J. Comput. Sci. Issues, vol. 8, no. 5, Art. no. 1, 2011.
  8. J. R. Cano et al., “An incremental learning method based on probabilistic neural networks and principal component analysis for classification of data streams,” Data Knowl. Eng., vol. 68, no. 9, pp. 1021–1036, 2009.
  9. H. A. Chipman, E. I. George, and R. E. McCulloch, “BART: Bayesian additive regression trees,” Ann. Appl. Stat., vol. 4, no. 1, pp. 266–298, 2010.
  10. G. Hulten, L. Spencer, and P. Domingos, “Mining time-changing data streams,” in Proc. ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, 2001, pp. 97–106.
  11. R. C. Merkle, “A certified digital signature,” in Advances in Cryptology (CRYPTO’89), Berlin, Germany: Springer, 1989, pp. 218–238.
  12. S. Tarkoma, C. E. Rothenberg, and E. Lagerspetz, “Theory and practice of Bloom filters for distributed systems,” IEEE Commun. Surveys Tuts., vol. 14, no. 1, pp. 131–155, 2012.
  13. B. Krawczyk et al., “Ensemble learning for data stream analysis: A survey,” Inf. Fusion, vol. 37, pp. 132–156, 2017.
  14. I. Žliobaitė, “Learning under concept drift: An overview,” arXiv preprint arXiv:1010.4784, 2010, doi: 10.48550/arXiv.1010.4784.
  15. J. Gama, P. Medas, G. Castillo, and P. Rodrigues, “Learning with drift detection,” in Proc. Brazilian Symp. Artificial Intelligence (SBIA), 2004.
  16. V. Losing, B. Hammer, and H. Wersing, “Incremental online learning: A review and comparison,” Neurocomputing, vol. 275, pp. 1261–1274, 2018.
  17. J. Read, A. Bifet, B. Pfahringer, and G. Holmes, “Batch-incremental versus instance-incremental learning in dynamic data,” in Proc. ECML PKDD, 2012, pp. 313–323.
  18. S. Wang et al., “Energy-efficient learning using VFDT-nmin,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 7, pp. 1991–2002, 2019.
  19. B. Krawczyk, “Learning from imbalanced data: Open challenges,” Prog. Artif. Intell., vol. 5, no. 4, pp. 221–232, 2016.
  20. L. L. Minku and X. Yao, “DDD: Diversity for dealing with concept drift,” IEEE Trans. Knowl. Data Eng., vol. 24, no. 11, pp. 1904–1917, 2012.
  21. Y. Chen et al., “RSBF: Reservoir sampling-based Bloom filter,” in Proc. IEEE INFOCOM, 2020.
  22. J. Jiang et al., “BSBF: Biased sampling Bloom filters,” IEEE Trans. Netw. Serv. Manag., vol. 16, no. 3, pp. 1229–1242, 2019.
  23. G. Zyskind, O. Nathan, and A. Pentland, “Decentralizing privacy: Using blockchain,” in Proc. IEEE Security and Privacy Workshops, 2015.
  24. A. Narayanan et al., Bitcoin and Cryptocurrency Technologies. Princeton, NJ, USA: Princeton Univ. Press, 2016.
  25. L. Lamport, “The Byzantine generals problem,” ACM Trans. Program. Lang. Syst. (TOPLAS), vol. 4, no. 3, pp. 382–401, 1981.
  26. C. Gentry, “A fully homomorphic encryption scheme,” Ph.D. dissertation, Stanford Univ., Stanford, CA, USA, 2009.
  27. M. A. Akbar et al., “RLR-Tree: Reinforcement learning for spatial indexing,” in Proc. ACM SIGSPATIAL Int. Conf. Advances in Geographic Information Systems, 2019.
  28. M. Satyanarayanan, “The emergence of edge computing,” Computer, vol. 50, no. 1, pp. 30–39, 2017.
  29. J. Gama and P. Kosina, “Recurrent concepts in data streams,” Knowl. Inf. Syst., vol. 40, no. 3, pp. 489–507, 2014.



         Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0




           ISSN 2199-8876
           Publisher: Edition Hochschule Anhalt
           Location: Anhalt University of Applied Sciences
           Email: leiterin.hsb@hs-anhalt.de
           Phone: +49 (0) 3496 67 5611
           Address: Building 01 - Red Building, Top floor, Room 425, Bernburger Str. 55, D-06366 Köthen, Germany
