Bobox

A highly parallel framework for data processing

David Bednárek, Michal Brabec, Jiří Dokulil, Zbyněk Falt, Martin Kruliš, Petr Malý, Jakub Yaghob, Filip Zavoral, Miroslav Čermák

Description

The Bobox parallelization framework has two primary goals: to simplify the writing of parallel, data-intensive programs, and to serve as a testbed for developing generic and data-oriented parallel algorithms. The main design principles are:
- all synchronization is hidden from the user; most technical details (Non-Uniform Memory Access, cache hierarchy, CPU architecture) are handled by the framework
- high-performance messaging is the only means of communication and synchronization
- the framework relies on basic, easy-to-comprehend paradigms such as task parallelism and the non-linear pipeline

Bobox provides a run-time environment that executes a generalized (non-linear) pipeline in parallel. The pipeline consists of computational components provided by the user and connecting parts supplied by the framework. The user defines the structure of the pipeline, but the run-time handles the communication and execution of individual parts; a component is ready to be executed when it has data waiting to be processed on its inputs. This simplifies the design of the individual computational components, since communication, synchronization and scheduling are handled by the framework.
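
The data-driven execution rule described above can be illustrated with a small sketch. It is not the Bobox API: the names Box, connect and run are hypothetical, the scheduler is single-threaded for clarity, and each box is assumed to have a single input queue. It only shows the idea that a component becomes runnable once data is waiting on its input, and that the pipeline need not be linear.

```python
from collections import deque


class Box:
    """A user-provided computational component (hypothetical name, not the Bobox API)."""

    def __init__(self, name, func):
        self.name = name
        self.func = func        # called once per incoming message
        self.inbox = deque()    # messages waiting on the input
        self.outputs = []       # downstream boxes fed by this box

    def ready(self):
        # A box is ready to run when data is waiting on its input.
        return bool(self.inbox)

    def run_once(self):
        message = self.inbox.popleft()
        result = self.func(message)
        if result is not None:
            # Messaging is the only way boxes communicate.
            for target in self.outputs:
                target.inbox.append(result)


def connect(source, target):
    """Add a connecting part from source to target."""
    source.outputs.append(target)


def run(boxes):
    """Toy single-threaded scheduler: run ready boxes until none remain."""
    progress = True
    while progress:
        progress = False
        for box in boxes:
            if box.ready():
                box.run_once()
                progress = True


# A non-linear pipeline: one source fans out to two branches that join in a sink.
collected = []
source = Box("source", lambda x: x)
double = Box("double", lambda x: x * 2)
square = Box("square", lambda x: x * x)
sink = Box("sink", lambda x: collected.append(x))  # returns None, so nothing is forwarded

connect(source, double)
connect(source, square)
connect(double, sink)
connect(square, sink)

source.inbox.extend([1, 2, 3])
run([source, double, square, sink])
print(sorted(collected))
```

The box functions contain no synchronization or scheduling logic; in this sketch that responsibility sits entirely in run, which mirrors the division of work between user components and the run-time.
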
Some design principles are shared with the TBB library: both provide task-level parallelism with a fixed number of worker threads and task stealing, although Bobox targets a more specialized class of tasks. There are, however, major differences in the organization of the task pool and in the way data is handled.
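
For readers unfamiliar with the two shared principles, the following is a generic sketch of a fixed-size worker pool with task stealing. It reflects neither the Bobox nor the TBB implementation, whose task pools are organized differently; the names Worker and run_tasks are hypothetical, and tasks are assumed not to spawn further tasks, which keeps termination simple.

```python
import threading
from collections import deque

NUM_WORKERS = 4  # fixed for the whole run, independent of the number of tasks


class Worker(threading.Thread):
    """One worker thread with its own task deque (illustrative only)."""

    def __init__(self, pool, results, results_lock):
        super().__init__()
        self.pool = pool                  # list of all workers, filled before start()
        self.results = results
        self.results_lock = results_lock
        self.tasks = deque()
        self.lock = threading.Lock()      # guards self.tasks

    def take_own(self):
        # The owner takes the most recently added task from its own deque.
        with self.lock:
            return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        # A thief takes the oldest task from the other end of the victim's deque.
        with victim.lock:
            return victim.tasks.popleft() if victim.tasks else None

    def next_task(self):
        task = self.take_own()
        if task is not None:
            return task
        for victim in self.pool:
            if victim is not self:
                task = self.steal_from(victim)
                if task is not None:
                    return task
        return None

    def run(self):
        while True:
            task = self.next_task()
            if task is None:
                return  # every deque is empty and tasks never spawn new ones
            value = task()
            with self.results_lock:
                self.results.append(value)


def run_tasks(tasks, num_workers=NUM_WORKERS):
    results, results_lock, pool = [], threading.Lock(), []
    workers = [Worker(pool, results, results_lock) for _ in range(num_workers)]
    pool.extend(workers)
    # Deliberately unbalanced: all tasks start on the first worker,
    # so the other workers can only obtain work by stealing.
    workers[0].tasks.extend(tasks)
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


squares = run_tasks([lambda n=n: n * n for n in range(10)])
print(sorted(squares))
```

The order in which results arrive depends on thread timing, so the example sorts them before printing.
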

Where to get it

The development version is available in the project SVN repository.

Links:

  • Public interface - a public web interface for running queries on our Bobox server

Contact email:

parg<at>ksi.mff.cuni.cz

Research

Research group at the department:

Parallel Architectures/Algorithms/Applications Research Group

Supporting research projects and grants:

GACR 201/09/0990, GAUK 277911, GAUK 472313, MSMT MSM0021620838, GACR P103-13-08195S, GACR P103-14-14292P, GACR P202/10/0761, GAUK SVV-2010-261312

Publications:

  • Bednárek D., Kruliš M., Yaghob J., Zavoral F.: Creating Distributed Execution Plans with BobolangNG, in Algorithms and Architectures for Parallel Processing, Granada, Springer International Publishing AG, ISBN: 978-3-319-49582-8, ISSN: 0302-9743, pp. 88-97, 2016 - text
  • Bednárek D., Kruliš M., Malý P., Yaghob J., Zavoral F., Pokorný J.: Combining Distributed Computing and Massively Parallel Devices to Accelerate Stream Data Processing, in Proceedings of the Seventh International Conference on Advances in Databases, Knowledge, and Data Applications, Rome, IARIA, ISBN: 978-1-61208-408-4, ISSN: 2308-4332, pp. 1-8, 2015
  • Falt Z., Kruliš M., Bednárek D., Yaghob J., Zavoral F.: Towards Efficient Locality Aware Parallel Data Stream Processing, in Journal of Universal Computer Science, Vol. 21, Num. 6, ISSN: 0948-6968, pp. 816-841, 2015 - text
  • Krížik L., Falt Z., Čermák M., Zavoral F.: Using Static Code Analysis for Improvement of Job Data Availability in Bobox Task Scheduling, accepted for publication in Proceedings of the 2nd International Conference on Communication and Computer Engineering, Phuket, Springer International Publishing, ISBN: 978-3-319-24582-9, ISSN: 1876-1100, pp. 1-8, 2015 - text
  • Krížik L., Zavoral F., Čermák M.: Using Static Code Analysis to Improve Coarse Task Granularity in Bobox, in The 10th International Conference on Digital Information Management - ICDIM 2015, Jeju, Republic of Korea, IEEE, ISBN: 978-1-4673-9152-8, pp. 237-242, 2015
  • Čermák M., Zavoral F.: Achieving High Availability in D-Bobox, in DBKDA 2014 The Sixth International Conference on Advances in Databases, Knowledge, and Data Applications, Chamonix, IARIA, ISBN: 978-1-61208-334-6, pp. 92-97, 2014
  • Falt Z., Bednárek D., Kruliš M., Yaghob J., Zavoral F.: Bobolang - A Language for Parallel Streaming Applications, in Proceedings of the 23rd International ACM Symposium on High-Performance Parallel and Distributed Computing, Vancouver, ACM, ISBN: 978-1-4503-2749-7, pp. 311-314, 2014
  • Falt Z., Kruliš M., Bednárek D., Yaghob J., Zavoral F.: Locality Aware Task Scheduling in Parallel Data Stream Processing, in Proceedings of the 8th International Symposium on Intelligent Distributed Computing - IDC'2014, Madrid, Springer Verlag, ISBN: 978-3-319-10421-8, ISSN: 1860-949X, pp. 331-342, 2014
  • Brabec M., Bednárek D.: Programming parallel pipelines using non-parallel C# code, accepted for publication in CEUR Workshop Proceedings, Donovaly, Slovakia, CEUR-WS.org, ISSN: 1613-0073, pp. 82-87, 2013
  • Čermák M., Zavoral F.: Dosiahnutie vysokej dostupnosti v D-Boboxe, in ITAT 2013: Information Technologies—Applications and Theory Proceedings, Donovaly, Slovakia, CreateSpace Independent Publishing Platform, ISBN: 978-1-4909-5200-0, ISSN: 1613-0073, pp. 69-74, 2013
  • Falt Z., Čermák M., Zavoral F.: Highly Scalable Sort-Merge Join Algorithm for RDF Querying, in Proceedings of the International Conference on Data Technologies and Applications, Reykjavík, Iceland, SciTePress, ISBN: 978-989-8565-67-9, pp. 293-300, 2013
  • Falt Z., Kruliš M., Yaghob J.: Bobolang - jazyk pro systém Bobox, in ITAT 2013: Information Technologies—Applications and Theory (Proceedings), Donovaly, Slovakia, CreateSpace Independent Publishing Platform, ISBN: 978-1-4909-5200-0, ISSN: 1613-0073, pp. 75-81, 2013
  • Bednárek D., Dokulil J., Yaghob J., Zavoral F.: Bobox: Parallelization Framework for Data Processing, accepted for publication in Advances in Information Technology and Applied Computing, Denpasar, AITAC, ISSN: 2251-3418, pp. 189-194, 2012
  • Bednárek D., Dokulil J., Yaghob J., Zavoral F.: Data-Flow Awareness in Parallel Data Processing, in Intelligent Distributed Computing VI, Calabria, Springer, ISBN: 978-3-642-32523-6, ISSN: 1860-949X, pp. 149-154, 2012
  • Čermák M., Falt Z., Zavoral F.: D-Bobox: O distribuovatelnosti Boboxu, in Zborník príspevkov prezentovaných na konferencii Informačné technológie – Aplikácie a Teória, ITAT 2012, Monkova dolina, Slovakia, publisher not stated, ISBN: 978-80-971144-1-1, pp. 41-46, 2012
  • Falt Z., Bednárek D., Čermák M., Zavoral F.: On Parallel Evaluation of SPARQL Queries, in The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications DBKDA 2012, Saint Gilles, Reunion Island, Xpert Publishing Services, ISBN: 978-1-61208-185-4, pp. 97-102, 2012
  • Falt Z., Bulánek J., Yaghob J.: On Parallel Sorting of Data Streams, in Advances in Databases and Information Systems, Poznan, Poland, Springer, ISBN: 978-3-642-32740-7, ISSN: 2194-5357, pp. 69-77, 2012
  • Falt Z., Čermák M., Dokulil J., Zavoral F.: Parallel SPARQL Query Processing Using Bobox, in International Journal On Advances in Intelligent Systems, Vol. 5, Num. 3, ISSN: 1942-2679, pp. 302-314, 2012
  • Čermák M., Dokulil J., Falt Z.: Vyhodnocování SPARQL dotazů systémem Bobox, in Informačné Technológie - Aplikácie a Teória, Vrátná dolina, Slovakia, PONT s. r. o., ISBN: 978-80-89557-01-1, pp. 63-68, 2011
  • Čermák M., Falt Z., Dokulil J., Zavoral F.: SPARQL Query Processing Using Bobox Framework, in The Fifth International Conference on Advances in Semantic Processing SEMAPRO 2011, Lisbon, Portugal, Xpert Publishing Services, ISBN: 978-1-61208-175-5, pp. 104-109, 2011
  • Dokulil J., Bednárek D., Yaghob J.: The Bobox Project: Parallelization Framework and Server for Data Processing, technical report no. 2011/1, Department of Software Engineering, 39 pages, 2011 - WWW
  • Falt Z., Kruliš M., Yaghob J.: Optimalizace třídicích algoritmů pro systémy proudového zpracování dat, in Informačné Technológie - Aplikácie a Teória, Vrátná dolina, PONT s. r. o., ISBN: 978-80-89557-01-1, pp. 69-74, 2011
  • Falt Z., Yaghob J.: Task Scheduling in Data Stream Processing, in Proceedings of the Dateso 2011 Workshop on DAtabases, TExts, Specifications and Objects, Písek, Czech Republic, Vysoká škola báňská - Technická univerzita Ostrava, ISBN: 978-80-248-2391-1, pp. 85-96, 2011
  • Bednárek D.: R-Programs: A Framework for Distributing XML Structural Joins across Function Calls, in Lecture Notes in Computer Science, Vol. 2010, Num. 5901, ISSN: 0302-9743, pp. 176-187, 2010 - SpringerLink
  • Bednárek D., Dokulil J.: TriQuery: Modifying XQuery for RDF and Relational Data, in 2010 Workshops on Database and Expert Systems Applications, Bilbao, Spain, IEEE Computer Society, ISBN: 978-0-7695-4174-7, ISSN: 1529-4188, pp. 342-346, 2010 - IEEE
  • Čermák M., Dokulil J., Zavoral F.: SPARQL Compiler for Bobox, in The Fourth International Conference on Advances in Semantic Processing, Florence, Italy, Xpert Publishing Services, ISBN: 978-1-61208-000-0, pp. 100-105, 2010
  • Dokulil J., Katreniaková J.: Bobox Model Visualization, in 2010 14th International Conference Information Visualisation, London, UK, IEEE Computer Society, ISBN: 978-0-7695-4165-5, ISSN: 1550-6037, pp. 537-542, 2010 - WWW
  • Bednárek D.: Bulk Evaluation of User-Defined Functions in XQuery, Ph.D. thesis, 158 pages, 2009 - WWW
  • Bednárek D., Dokulil J., Yaghob J., Zavoral F.: The Bobox Project - A Parallel Native Repository for Semi-structured Data and the Semantic Web, in ITAT 2009 - IX. Informačné technológie - aplikácie a teória, PONT Slovakia, ISBN: 978-80-970179-1-0, pp. 44-59, September 2009
  • Bednárek D., Dokulil J., Yaghob J., Zavoral F.: Using Methods of Parallel Semi-structured Data Processing for Semantic Web, in 3rd International Conference on Advances in Semantic Processing, SEMAPRO, Sliema, Malta, IEEE Computer Society Press, ISBN: 978-0-7695-3833-4, pp. 44-49, October 2009
  • Bednárek D.: Output-Driven XQuery Evaluation, in 2nd International Symposium on Intelligent Distributed Computing, Springer-Verlag, ISBN: 978-3-540-85256-8, ISSN: 1860-949X, pp. 55-64, September 2008 - WWW
The content of this web site is licensed under Creative Commons Attribution-NonCommercial 3.0 Czech Republic