xMapReduce Core Concepts

This guide explains the architecture and design principles of the xMapReduce framework, how it leverages XCENA’s PXL (Parallel Execution Library) for efficient parallel data processing.

What is MapReduce?

MapReduce is a programming model designed for processing large datasets across distributed systems. It was originally developed by Google and has become a standard approach for parallel computing. The model takes its name from two main operations:

  • Map: Process each piece of input data independently
  • Reduce: Combine intermediate results into final outputs

This approach makes it possible to handle datasets that are too large for a single computer to process efficiently.

xMapReduce Architecture

xMapReduce implements the MapReduce model using a three-phase process:

  1. Map Phase: Input data is divided into chunks and processed by multiple mappers in parallel
  2. Local Reduce Phase: Mapped data is processed by reducers that perform local aggregation
  3. Global Reduce Phase: Results from local reducers are combined to produce the final output

xMapReduce Architecture Diagram

Key Components

MapReduce Class

The central component that orchestrates the entire pipeline:

  • Manages parallel execution of map and reduce operations
  • Handles data distribution and collection
  • Controls memory allocation through KVStoreManager
  • Supports execution on both host CPU and XCENA accelerator
// Example of creating and using a MapReduce instance
// Method 1: Direct constructor
auto mapReduce = std::make_unique<xmapreduce::MapReduce<MyMapper, MyReducer>>(numMapTasks, numReduceTasks, resultCapacity);  
mapReduce->initialize();  
mapReduce->execute(inputData, dataSize);  

// Method 2: Builder pattern (newer implementation)
auto mapReduce = xmapreduce::MapReduce<MyMapper, MyReducer>::builder()  
                     .numMaps(8)
                     .numReduces(4)
                     .numTasks(2)
                     .resultCapacity(1024)
                     .build();
mapReduce->execute(inputData, dataSize);

Mapper Interface

Base interface for implementing custom mapping logic:

class MyMapper : public xmapreduce::Mapper<InputType, xmapreduce::KeyValuePair<KeyType, ValueType>>
{
public:
    void mapImpl(InputType* data, int count, xkvstore::KVStore<KeyType, ValueType>* kvStore) override
    {
        // Map implementation
        for (int i = 0; i < count; i++) {
            // Process data[i] and emit key-value pairs
            kvStore->insert(key, value);
        }
    }
};

// Register mapper for dynamic creation at runtime
REGISTER_MAPPER(MyMapper);

Reducer Interface

Base interface for implementing custom reduction logic:

class MyReducer : public xmapreduce::Reducer<xmapreduce::KeyValuePair<KeyType, ValueType>, xmapreduce::KeyValuePair<KeyType, ValueType>>
{
public:
    void reduceImpl(xkvstore::KVStore<KeyType, ValueType>* input, xkvstore::KVStore<KeyType, ValueType>* output) override
    {
        // Reduce implementation
        // Process all values for each key and emit results
        std::unordered_map<KeyType, ValueType> aggregate;
        for (const auto& kv : *input)
        {
            aggregate[kv.getKey()] += kv.getValue();
        }
        for (const auto& [key, value] : aggregate)
        {
            output->insert(key, value);
        }
    }
};

// Register reducer for dynamic creation at runtime
REGISTER_REDUCER(MyReducer);

KVStore

Container for key-value pairs with efficient parallel access:

  • Supports concurrent insertion from multiple tasks
  • Contains multiple blocks for dynamic memory expansion
  • Each block divided into regions for task-specific access
  • Provides iterators for traversing stored data
  • Manages memory allocations through the KVStoreManager

Block and Region Memory Structure

For efficient parallel memory management, xMapReduce uses a block-based approach:

  • Block: Basic memory allocation unit containing multiple regions
  • Region: Task-specific memory area within a block

This structure prevents memory conflicts during parallel execution and enables efficient memory expansion when needed.

Execution Flow

1. Initialization

When a MapReduce job is initialized:

  1. The MapReduce object is created with mapper and reducer types
  2. PXL runtime is initialized for parallel execution
  3. KVStoreManager creates KVStores for map and reduce phases
  4. Memory blocks and regions are allocated
  5. MapReduce registers the mapper and reducer with the registry system

2. Map Execution

During the map phase:

  1. Input data is divided among map tasks
  2. Each mapper processes its assigned data chunk in parallel
  3. Mappers emit key-value pairs into the map KVStore
  4. If a region becomes full, a new block is automatically allocated

3. Local Reduce Execution

During the local reduce phase:

  1. Map results are grouped by key
  2. Each reducer processes values for its assigned keys
  3. Reducers emit aggregated results to the local reduce KVStore

4. Global Reduce

After local reduction:

  1. Results from all local reducers are combined using KVStoreManager
  2. Final aggregation is performed across all intermediate results
  3. Results are made available through the result KVStore and can be retrieved with getResultKVStore()

Memory Management

xMapReduce employs a block and region-based memory allocation system:

Block Allocation

When a KVStore needs more memory:

  1. Initial insert attempt may fail if the region is full
  2. KVStoreManager is notified to allocate a new block
  3. New block is created with multiple regions
  4. Block is added to the KVStore’s block list
  5. The failed insert operation is retried

This dynamic block allocation enables efficient handling of varying data sizes without manual memory management.

Common MapReduce Patterns

Several common patterns emerge in MapReduce applications:

  • Filtering: Mappers emit only items meeting certain criteria
  • Counting: Count occurrences of items (like in word count)
  • Summation: Aggregate numerical values
  • Averaging: Calculate averages for grouped data
  • Indexing: Create indexes or inverted indexes for data retrieval

When to Use xMapReduce

xMapReduce works well for tasks that:

  • Can be broken into independent sub-problems
  • Involve processing each record in a similar way
  • Require aggregating or combining intermediate results
  • Benefit from parallel processing

For more examples, see the Examples page.