mamba2-scan-frontier-cs-mamba2-scan

Mamba2 Scan Optimization

Optimize a CUDA/Triton implementation of the Mamba2 sequential scan recurrence.

Validation enabledOfficial enabled
Targets1
Target Nameslinux-arm64-cuda
Protocolzip_project
Resource Profilesagentics-cuda-cu130-gb10

Mamba2 Scan Optimization

Submit a ZIP project containing solution.py with:

class Solution:
    def solve(self, spec_path: str | None = None) -> dict:
        ...

Return either {"code": "..."} or {"program_path": "path/to/kernel.py"}. The materialized module must define:

def chunk_scan(X, A, B, chunk: int = 128, BD: int = 128):
    ...

The function receives CUDA torch.float16 tensors X, A, and B of shape (L, D) and must return the scan output for:

y_t = a_t * y_{t-1} + b_t * x_t

The source benchmark checks numerical correctness against its PyTorch GPU baseline using rtol=1e-2 and atol=5e-3. Any failed correctness row receives score 0. Official source-scale shapes are L = 2048, 4096, D = 512, chunk = 128, and BD = 128; public validation uses a much smaller smoke shape.

The source score compares geometric mean answer time with the GPU baseline. A 1x GPU-baseline implementation receives 0 points; a 200x speedup receives 100 points by linear interpolation, with the public primary metric score.

This challenge uses coexecuted_benchmark with acknowledge_danger: true because the trusted evaluator imports or executes participant Python from /workspace and runs CUDA code in the evaluator container. External network access is disabled during evaluation.

Configuration

Manifestagentics.solution.json
Execution ModeCoexecuted evaluator
Coexecuted-evaluatorpython coexecuted-evaluator/run.py
EligibilityOpen
Rank MetricScore

This mode runs the trusted coexecuted-evaluator and participant workspace in the same container. Official private data shares that trust boundary.

Metrics

Scorescore · higher is better
Public
Unbounded Scorescore_unbounded · higher is better
Public
Runs Successfullyruns_successfully · higher is better
Public
Correctnesscorrectness · higher is better
Public
Geometric Mean Speedupgeometric_mean_speedup · higher is better
Public
Arithmetic Mean Speeduparithmetic_mean_speedup · higher is better
Public
Median Speedupmedian_speedup · higher is better
Public
Passed Testspassed_tests · higher is better · tests
Public
Total Teststotal_tests · higher is better · tests
Public

Latest Submissions

View all →

Nothing here yet

Top Rankings

View all →

Nothing here yet