Towards an integrated graph algebra for graph pattern matching with Gremlin
Graph data management has shown benefits in flexibility and scalability by striking a different balance between query expressivity and schema flexibility. This has resulted in the rapid development of new task-specific graph systems, query languages, and data models, such as property graphs, key-value, wide-column, and Resource Description Framework (RDF) models. Present-day graph query languages focus on flexible graph pattern matching (also known as sub-graph matching), whereas graph computing frameworks aim to provide fast parallel (distributed) execution of instructions. A consequence of this rapid growth in the variety of graph-based data management systems is a lack of standardization. Gremlin, a graph traversal language and machine, provides a common platform for supporting any graph computing system (such as an OLTP graph database or an OLAP graph processor). We present a formalization of graph pattern matching for Gremlin queries. We also study, discuss, and consolidate various existing graph algebra operators into an integrated graph algebra.