Sztabina: Define and implement Git repository introspection contract (SZ-55)
rk@tigase.net opened 5 days ago

**Foundational infrastructure for provider-agnostic Git integration**

This issue establishes the minimal, stable Git contract that Sztabina exposes to the rest of Sztab.
Everything here is intentionally UI-agnostic and provider-neutral.


Description

Sztab requires a consistent, reliable way to query Git repositories regardless of the backing Git server (Forgejo, OneDev, future providers).

This issue defines and implements the Git introspection contract that Sztabina must support, using plain git commands rather than provider-specific APIs or CLIs.

This contract is the foundation for:

  • Creating pull requests from commits
  • Role-based PR views
  • Linking commits to issues via issue IDs in commit messages
  • Future commit-level UX (history, traceability, audits)

Everything beyond this contract is an implementation detail.


Scope

Repository-level (static / low churn)

Must work identically for Forgejo and OneDev.

  • Default branch
    • name (e.g. main)
    • HEAD commit SHA
  • Branch list
    • branch name
    • isDefault flag
    • last commit SHA

Enables:

  • PR base branch selection
  • “Create PR from commit” sanity checks

Commit-level (core value, non-negotiable)

Minimum viable data:

  • Commit SHA
  • Author (name + email)
  • Authored timestamp
  • Full commit message (not truncated)
  • Parent commit SHAs
  • Branch containment
    • at least: whether commit is reachable from default branch

Optional but valuable (Phase 1.5 if cheap):

  • Changed file paths (no diffs)
  • Commit URL (deep link when provider supports it)

Enables:

  • Issue ↔ commit linking (issue ID in commit message)
  • Creating PRs from commits
  • “Commits relevant to issue” views
    (addresses Wojciech’s feature request)

Comment-level (defined now, implemented later)

Phase 2 – do not block this issue

  • Commit comments
    • id
    • author
    • body
    • timestamp

This is defined now, ahead of implementation, so that the contract stays forward-compatible.


Explicit Non-Requirements

  • No UI work
  • No PR creation logic
  • No provider-specific APIs
  • No webhooks or event ingestion
  • No diff rendering

This issue is pure contract + data acquisition.


Acceptance Criteria

  • Same inputs → same outputs for Forgejo and OneDev
  • Contract is sufficient to:
    • determine default branch
    • list branches
    • inspect commits
    • associate commits with issues by message parsing
  • No UI or PR code required for closure
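As an illustration of "associate commits with issues by message parsing", here is a minimal sketch. It assumes issue IDs follow the `SZ-55` shape seen in this tracker (uppercase project key, hyphen, number); the function name `extract_issue_ids` and the exact pattern are hypothetical, not part of the contract.

```python
import re

# Assumption: issue IDs look like "SZ-55" (uppercase key, hyphen, digits).
ISSUE_ID = re.compile(r"\b([A-Z][A-Z0-9]*-\d+)\b")


def extract_issue_ids(message: str) -> list[str]:
    """Return distinct issue IDs from a commit message, in order of first use."""
    seen: dict[str, None] = {}
    for match in ISSUE_ID.finditer(message):
        seen.setdefault(match.group(1), None)
    return list(seen)


if __name__ == "__main__":
    print(extract_issue_ids("SZ-55: add contract\n\nRefs SZ-34, SZ-55"))
```

A pattern this loose also matches tokens such as `UTF-8`, so a real implementation would probably restrict the key to known project prefixes.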

  • rk@tigase.net commented 5 days ago

    Implementation Notes

    • Use plain git CLI commands only
    • Repository is assumed to be locally available / cloned
    • Output should be normalized into provider-agnostic DTOs
    • Errors must be explicit and diagnosable (missing repo, invalid ref, etc.)
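One possible shape for the provider-agnostic DTOs, sketched from the fields listed under Scope. The class names RepositoryInfo, BranchInfo and CommitInfo come from the worklog; the field names, types and optional fields below are assumptions and may differ from the DTOs already defined in Sztab.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BranchInfo:
    name: str
    is_default: bool
    last_commit_sha: str


@dataclass(frozen=True)
class RepositoryInfo:
    default_branch: str
    head_commit_sha: str
    branches: tuple[BranchInfo, ...] = ()


@dataclass(frozen=True)
class CommitInfo:
    sha: str
    author_name: str
    author_email: str
    authored_at: int                      # Unix seconds, as produced by %at
    message: str                          # full message, never truncated
    parent_shas: tuple[str, ...] = ()
    on_default_branch: Optional[bool] = None          # None = not computed
    changed_paths: Optional[tuple[str, ...]] = None   # optional (Phase 1.5)
    url: Optional[str] = None                         # optional deep link
```

Frozen dataclasses keep the DTOs immutable, which fits a read-only contract; `None` marks optional data that was not requested rather than data that is empty.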

    Assumptions

    • Sztabina executes Git commands inside a checked-out working copy
    • Repo is already cloned and authenticated
    • No reliance on Git hosting APIs
    • **Git version ≥ 2.20**
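The version assumption can be checked up front. A minimal sketch that parses the text printed by `git --version`; the helper names are hypothetical.

```python
import re

MIN_GIT_VERSION = (2, 20)  # from the assumption above


def parse_git_version(output: str) -> tuple[int, int]:
    """Extract (major, minor) from `git --version` output."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized `git --version` output: {output!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(output: str) -> bool:
    return parse_git_version(output) >= MIN_GIT_VERSION


if __name__ == "__main__":
    print(is_supported("git version 2.39.3 (Apple Git-146)"))
```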
    
  • rk@tigase.net commented 5 days ago

    Worklog

    • Finalize Git contract (repo / branch / commit)
    • Document exact git commands used per contract item
    • Implement repository-level queries
    • Implement commit-level queries
    • Normalize output into stable DTOs
    • Verify behavior against:
      • Forgejo-managed repository
      • OneDev-managed repository
    • Add minimal sanity tests / scripts
    • Document assumptions and limitations

  • rk@tigase.net commented 5 days ago
    • Enables: Issue-commit linking (SZ-34) (requested by Wojciech)
      This issue provides the required infrastructure but does not implement the UX.

  • rk@tigase.net commented 5 days ago

    Detailed Worklog & Time Estimates

    This estimate covers implementing the Git query contract for Sztabina using pure git commands, validated against Forgejo and OneDev.


    Phase 0 — Interface Finalization & Git Command Mapping

    Estimated time: 1.0 – 1.5 hours

    Tasks

    • Re-read and lock the interface (repository, branch, commit, comment levels)
    • Decide exact git commands per contract item
    • Document assumptions:
      • local clone available
      • non-bare repo vs bare repo handling
      • shallow clone behavior
    • Define error semantics (missing default branch, detached HEAD, etc.)

    Deliverables

    • Finalized contract (no TODOs)
    • Mapping table: Contract Field → Git Command
    • Notes on known limitations

    Phase 1 — Repository-Level Queries

    Estimated time: 1.0 – 1.5 hours

    Tasks

    • Implement default branch detection
      (git symbolic-ref refs/remotes/origin/HEAD, with fallback logic)
    • Retrieve branch list
    • Determine:
      • branch name
      • default flag
      • last commit SHA per branch
    • Normalize output into Sztab DTOs (defined already)

    Deliverables

    • RepositoryInfo structure
    • BranchInfo list
    • Unit-level validation via CLI runs

    Phase 2 — Commit-Level Queries (Core Value)

    Estimated time: 2.0 – 2.5 hours

    Tasks

    • Implement commit metadata extraction:
      • SHA
      • author name/email
      • authored timestamp
      • full commit message
      • parent SHAs
    • Implement branch containment checks:
      • default branch presence
      • all containing branches (if cheap)
    • Optional (but likely included):
      • changed file paths (git show --name-only)
    • Generate commit deep-link URLs (provider-agnostic template)

    Deliverables

    • CommitInfo structure
    • Support for “commits relevant to issue”
    • Foundation for PR-from-commit

    Phase 3 — Provider Verification (Forgejo + OneDev)

    Estimated time: 1.5 – 2.0 hours

    Tasks

    • Run all queries against Forgejo-managed repo
    • Run all queries against OneDev-managed repo
    • Identify provider quirks:
      • default branch naming
      • HEAD resolution
      • refs layout
    • Fix implementation without changing contract

    Deliverables

    • Verified compatibility with both providers
    • Provider-specific notes (non-breaking)

    Phase 4 — Hardening & Edge Cases

    Estimated time: 0.75 – 1.0 hour

    Tasks

    • Handle empty repos
    • Handle detached HEAD
    • Handle shallow clones gracefully
    • Improve error messages and logging

    Deliverables

    • Defensive code paths
    • Clear failure semantics
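The edge cases in this phase can be detected with a few read-only probes. A sketch, assuming the process runs inside the repository; `probe_repo_state` and the `run` callable (anything that executes git and returns exit code plus stdout) are hypothetical names.

```python
import subprocess
from typing import Callable

Runner = Callable[[list[str]], tuple[int, str]]


def run_git(args: list[str]) -> tuple[int, str]:
    """Run git in the current directory; return (exit code, stripped stdout)."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def probe_repo_state(run: Runner = run_git) -> dict[str, bool]:
    # HEAD does not resolve to a commit in a repository with no commits.
    head_code, _ = run(["rev-parse", "--verify", "--quiet", "HEAD"])
    # symbolic-ref fails when HEAD is detached (not pointing at a branch).
    sym_code, _ = run(["symbolic-ref", "--quiet", "HEAD"])
    _, shallow = run(["rev-parse", "--is-shallow-repository"])
    _, bare = run(["rev-parse", "--is-bare-repository"])
    return {
        "empty": head_code != 0,
        "detached_head": sym_code != 0,
        "shallow": shallow == "true",
        "bare": bare == "true",
    }
```

Calling `probe_repo_state()` once before any query lets the other functions return explicit errors instead of surfacing raw git failures.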

    Phase 5 — Documentation & Handoff

    Estimated time: 0.75 – 1.0 hour

    Tasks

    • Inline code documentation
    • README / design note:
      • what Sztabina queries
      • what it intentionally does NOT query
    • Notes for future Phase 2 (commit comments)

    Deliverables

    • Developer-facing documentation
    • Clean handoff to PR / Issue workflows

    Total Estimated Time

    | Phase   | Time (hours) |
    |---------|--------------|
    | Phase 0 | 1.0 – 1.5    |
    | Phase 1 | 1.0 – 1.5    |
    | Phase 2 | 2.0 – 2.5    |
    | Phase 3 | 1.5 – 2.0    |
    | Phase 4 | 0.75 – 1.0   |
    | Phase 5 | 0.75 – 1.0   |
    | Total   | 7.0 – 9.5    |

    Notes

    • This work is foundational, not cosmetic.
    • It directly enables:
      • issue ↔ commit linking
      • PR creation from commits
      • Wojciech’s requested functionality
    • No UI work included.
    • No provider-specific APIs used.
    • Everything beyond this becomes cheaper once this exists.
  • rk@tigase.net commented 5 days ago
    rksuma@Ramakrishnans-MacBook-Pro sztab % git checkout wolnosc
    Switched to branch 'wolnosc'
    Your branch is up to date with 'origin/wolnosc'.
    rksuma@Ramakrishnans-MacBook-Pro sztab %
    rksuma@Ramakrishnans-MacBook-Pro sztab % git pull
    Already up to date.
    rksuma@Ramakrishnans-MacBook-Pro sztab %
    rksuma@Ramakrishnans-MacBook-Pro sztab % git checkout -b feature/SZ-55-sztabina-git-introspection
    Switched to a new branch 'feature/SZ-55-sztabina-git-introspection'
    rksuma@Ramakrishnans-MacBook-Pro sztab % 
    
    
  • rk@tigase.net changed state to 'In Progress' 5 days ago
    Previous Value: Open
    Current Value: In Progress
  • rk@tigase.net commented 5 days ago

    Sztabina Git Discovery Contract (CLI-level)

    This defines the exact Git CLI–level contract Sztabina uses to introspect managed Git repositories.

    This is the contract. Everything else (language, adapters, APIs, storage) is implementation detail.

    All commands are:

    • Read-only
    • Deterministic
    • Portable across Forgejo, OneDev, bare Git repos
    • Fail-fast (no guessing)

    Invariant

    Sztabina never guesses. If Git does not know, Sztabina returns an explicit error.
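The invariant maps naturally onto a small error hierarchy. The five `Err...` names below are the ones used in this contract; the base class `GitContractError` is a hypothetical addition so callers can catch any contract failure in one place.

```python
class GitContractError(Exception):
    """Base for all explicit contract failures (hypothetical name)."""


class ErrDefaultBranchUnknown(GitContractError):
    """Neither origin/HEAD nor the remote reports a default branch."""


class ErrNoRemoteOrigin(GitContractError):
    """The repository has no remote named "origin"."""


class ErrDefaultBranchNotFetched(GitContractError):
    """The default branch is known by name but has no local ref."""


class ErrUnknownBranch(GitContractError):
    """The requested branch does not exist."""


class ErrUnknownCommit(GitContractError):
    """The requested SHA does not name a commit in this repository."""
```

A caller would then write `except GitContractError as err:` and report `err` as-is, rather than substituting a fallback value.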


    resolveDefaultBranch()

    resolveDefaultBranch():
        if symbolic-ref refs/remotes/origin/HEAD exists:
            return that branch
        else if git remote show origin yields "HEAD branch":
            return that branch
        else:
            return ErrDefaultBranchUnknown
    

    Commands:

    • git symbolic-ref refs/remotes/origin/HEAD
    • git remote show origin
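A sketch of the two-step lookup above. The `run` parameter and helper names are hypothetical; `LC_ALL=C` is set on the assumption that parsing the "HEAD branch" line is safer against untranslated output.

```python
import os
import re
import subprocess
from typing import Callable

Runner = Callable[[list[str]], tuple[int, str]]
PREFIX = "refs/remotes/origin/"


class ErrDefaultBranchUnknown(Exception):
    pass


def run_git(args: list[str]) -> tuple[int, str]:
    env = {**os.environ, "LC_ALL": "C"}  # assumption: avoid localized output
    proc = subprocess.run(["git", *args], capture_output=True, text=True, env=env)
    return proc.returncode, proc.stdout.strip()


def resolve_default_branch(run: Runner = run_git) -> str:
    # Step 1: the locally recorded origin/HEAD symref.
    code, out = run(["symbolic-ref", "--quiet", "refs/remotes/origin/HEAD"])
    if code == 0 and out.startswith(PREFIX):
        return out[len(PREFIX):]
    # Step 2: ask the remote and look for the "HEAD branch: <name>" line.
    code, out = run(["remote", "show", "origin"])
    if code == 0:
        match = re.search(r"^\s*HEAD branch:\s*(\S+)\s*$", out, re.MULTILINE)
        if match and match.group(1) != "(unknown)":
            return match.group(1)
    # No guessing: fail explicitly.
    raise ErrDefaultBranchUnknown("origin/HEAD is unset and the remote reported no HEAD branch")
```

Note that `git remote show origin` contacts the remote unless `-n` is passed, so the fallback step needs network access and credentials, unlike the other commands in this contract.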

    resolveRepositoryName()

    resolveRepositoryName():
        if remote "origin" exists:
            parse repository name from remote URL
            return repo name
        else:
            return ErrNoRemoteOrigin
    

    Command:

    • git remote get-url origin

    Notes:

    • Strip .git
    • Last path segment only
    • Works for HTTPS and SSH URLs
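The three notes above reduce to a small pure function over the output of `git remote get-url origin`. A sketch; the function name and the example hosts in the usage line are hypothetical.

```python
import re


def parse_repository_name(remote_url: str) -> str:
    """Return the last path segment of a remote URL, without a .git suffix."""
    path = remote_url.strip().rstrip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # Split on "/" and ":" so scp-like URLs (git@host:owner/repo) also work.
    name = re.split(r"[/:]", path)[-1]
    if not name:
        raise ValueError(f"cannot derive repository name from {remote_url!r}")
    return name


if __name__ == "__main__":
    print(parse_repository_name("git@forgejo.example.com:tigase/sztab.git"))
```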

    resolveHeadCommit(defaultBranch)

    resolveHeadCommit(defaultBranch):
        if refs/remotes/origin/<defaultBranch> exists:
            return commit SHA
        else if refs/heads/<defaultBranch> exists:
            return commit SHA
        else:
            return ErrDefaultBranchNotFetched
    

    Commands:

    • git rev-parse refs/remotes/origin/<defaultBranch>
    • git rev-parse refs/heads/<defaultBranch>

    listBranches()

    listBranches():
        branches = []
        for each remote branch under refs/remotes/origin/*:
            branch.name = short name
            branch.lastCommitSha = commit SHA
            branch.isDefault = (branch.name == resolveDefaultBranch())
            append branch
        return branches
    

    Command:

    • git for-each-ref refs/remotes/origin --format="%(refname:short) %(objectname)"

    Notes:

    • Exclude origin/HEAD
    • Local branches are not required
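A sketch of the parsing step for listBranches(). It deliberately deviates from the command above by requesting `%(refname)` instead of `%(refname:short)`: the short form of `refs/remotes/origin/HEAD` can be abbreviated differently depending on the Git version, which makes the "Exclude origin/HEAD" rule harder to apply reliably. The function and key names are hypothetical.

```python
PREFIX = "refs/remotes/origin/"
# Assumed invocation:
#   git for-each-ref refs/remotes/origin --format="%(refname) %(objectname)"
FOR_EACH_REF_FORMAT = "%(refname) %(objectname)"


def parse_branches(output: str, default_branch: str) -> list[dict]:
    """Turn for-each-ref output into branch records, skipping origin/HEAD."""
    branches = []
    for line in output.splitlines():
        if not line.strip():
            continue
        ref, sha = line.split(" ", 1)  # ref names cannot contain spaces
        if not ref.startswith(PREFIX):
            continue
        name = ref[len(PREFIX):]
        if name == "HEAD":
            continue
        branches.append({
            "name": name,
            "lastCommitSha": sha,
            "isDefault": name == default_branch,
        })
    return branches
```

The default branch is resolved once and passed in, rather than calling resolveDefaultBranch() for every branch as the pseudocode literally reads.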

    listCommits(limit, branch?)

    listCommits(limit, branch?):
        target = branch if provided else resolveDefaultBranch()
        if target does not exist:
            return ErrUnknownBranch
        return commits ordered by authored date (desc)
    

    Command:

    • git log --max-count=<limit> --format="%H|%an|%ae|%at|%B|%P" <target>

    Fields extracted:

    • commit SHA
    • author name
    • author email
    • authored timestamp
    • full commit message (not truncated)
    • parent SHAs
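One caveat with a `|`-delimited format: `%B` is multi-line and a commit message may itself contain `|`, so the fields cannot always be split back apart. The sketch below swaps in ASCII unit and record separators via git's `%xNN` placeholder; this is a suggested variation, not the command documented above, and all names are hypothetical.

```python
FIELD_SEP = "\x1f"   # ASCII unit separator, emitted by %x1f
RECORD_SEP = "\x1e"  # ASCII record separator, emitted by %x1e
LOG_FORMAT = "%H%x1f%an%x1f%ae%x1f%at%x1f%P%x1f%B%x1e"


def log_args(limit: int, target: str) -> list[str]:
    """Arguments for `git log`; the trailing "--" marks target as a revision."""
    return ["log", f"--max-count={limit}", f"--format={LOG_FORMAT}", target, "--"]


def parse_log(output: str) -> list[dict]:
    commits = []
    for record in output.split(RECORD_SEP):
        record = record.lstrip("\n")  # git puts a newline between records
        if not record:
            continue
        sha, name, email, authored, parents, message = record.split(FIELD_SEP, 5)
        commits.append({
            "sha": sha,
            "authorName": name,
            "authorEmail": email,
            "authoredAt": int(authored),   # Unix seconds
            "parents": parents.split(),    # empty list for a root commit
            "message": message.rstrip("\n"),
        })
    return commits
```

The message is placed last so that `split(FIELD_SEP, 5)` leaves it intact whatever it contains. The sketch leaves ordering to git, whose default for `git log` is reverse chronological by commit date, so ordering by authored date would need an extra sort or option.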

    commitExists(sha)

    commitExists(sha):
        if git cat-file -e succeeds:
            return true
        else:
            return false

    Command:

    • git cat-file -e <sha>^{commit}
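A sketch of commitExists(sha) built on the exit status of `git cat-file -e`. The hex check before calling git is an added assumption (it keeps branch names and option-like strings out of the command line); the `run` callable is hypothetical.

```python
import re
import subprocess
from typing import Callable

Runner = Callable[[list[str]], tuple[int, str]]
HEX_SHA = re.compile(r"[0-9a-fA-F]{4,64}")


def run_git(args: list[str]) -> tuple[int, str]:
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def commit_exists(sha: str, run: Runner = run_git) -> bool:
    # Only full or abbreviated hex object names are accepted.
    if not HEX_SHA.fullmatch(sha):
        return False
    # "<sha>^{commit}" requires the object to be (or peel to) a commit.
    code, _ = run(["cat-file", "-e", f"{sha}^{{commit}}"])
    return code == 0
```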

    branchesContainingCommit(sha)

    branchesContainingCommit(sha):
        if commit does not exist:
            return ErrUnknownCommit
        return list of remote branches containing sha
    

    Command:

    • git branch -r --contains <sha>

    Post-processing:

    • Strip origin/
    • Exclude origin/HEAD
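The post-processing rules can be sketched as a pure function over the text printed by `git branch -r --contains`. The function name is hypothetical.

```python
PREFIX = "origin/"


def parse_containing_branches(output: str) -> list[str]:
    """Return short branch names from `git branch -r --contains` output."""
    names = []
    for line in output.splitlines():
        entry = line.strip()
        if not entry:
            continue
        # The symref is printed as "origin/HEAD -> origin/<branch>"; skip it.
        if " -> " in entry or entry == "origin/HEAD":
            continue
        if entry.startswith(PREFIX):
            names.append(entry[len(PREFIX):])
    return names


if __name__ == "__main__":
    print(parse_containing_branches("  origin/HEAD -> origin/main\n  origin/main\n"))
```

Branches from remotes other than origin are dropped, on the assumption that only origin is in scope for this contract.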

    isCommitOnDefaultBranch(sha)

    isCommitOnDefaultBranch(sha):
        defaultBranch = resolveDefaultBranch()
        if defaultBranch in branchesContainingCommit(sha):
            return true
        else:
            return false
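As a possible alternative to scanning every containing branch, the same answer can come from a single ancestry test. This sketch uses `git merge-base --is-ancestor`, which is not the approach written in the pseudocode above; it exits 0 when the first commit is an ancestor of the second and 1 when it is not. The names are hypothetical.

```python
import subprocess
from typing import Callable

Runner = Callable[[list[str]], tuple[int, str]]


class ErrUnknownCommit(Exception):
    pass


def run_git(args: list[str]) -> tuple[int, str]:
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def is_commit_on_default_branch(sha: str, default_branch: str,
                                run: Runner = run_git) -> bool:
    tip = f"refs/remotes/origin/{default_branch}"
    code, _ = run(["merge-base", "--is-ancestor", sha, tip])
    if code == 0:
        return True
    if code == 1:
        return False
    # Any other status means git could not evaluate the question at all.
    raise ErrUnknownCommit(f"cannot test {sha} against {tip} (exit {code})")
```

Whether this is cheaper in practice would need measuring on real repositories; the result should match the pseudocode for commits that exist.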
    

    listChangedFiles(sha) (optional but cheap)

    listChangedFiles(sha):
        if commit does not exist:
            return ErrUnknownCommit
        return file paths changed in commit
    

    Command:

    • git diff-tree --no-commit-id --name-only -r <sha>
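A sketch of the argument list and output parsing for listChangedFiles(sha). Two flags are added here as assumptions beyond the command above: `-z` for NUL-terminated paths (so names with spaces or newlines survive) and `--root` so that an initial commit reports its files. Helper names are hypothetical.

```python
def diff_tree_args(sha: str) -> list[str]:
    """Arguments for `git diff-tree` listing paths changed by one commit."""
    return ["diff-tree", "--no-commit-id", "--name-only", "-r", "-z", "--root", sha]


def parse_changed_paths(output: str) -> list[str]:
    """Split NUL-terminated output into a list of paths."""
    return [path for path in output.split("\0") if path]


if __name__ == "__main__":
    print(parse_changed_paths("README.md\0docs/design note.md\0"))
```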

    resolveCommitUrl(sha) (best-effort)

    resolveCommitUrl(sha):
        if remote URL matches Forgejo / Gitea / OneDev pattern:
            construct commit URL
            return URL
        else:
            return None
    

    Notes:

    • Optional metadata
    • Never blocks core workflows
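A best-effort sketch of the URL construction. Several things here are assumptions: the provider is passed in explicitly because a remote URL alone does not identify it; only the Forgejo / Gitea layout `/<owner>/<repo>/commit/<sha>` is covered; and for scp-like SSH remotes the web UI is assumed to be served over HTTPS on the same host. Anything else returns None, as the pseudocode requires.

```python
import re
from typing import Optional

HTTP_RE = re.compile(r"^(https?)://(?:[^@/]+@)?([^/]+)/(.+?)(?:\.git)?/?$")
SCP_RE = re.compile(r"^(?:[^@/]+@)?([^:/]+):(?!/)(.+?)(?:\.git)?/?$")


def resolve_commit_url(remote_url: str, sha: str, provider: str) -> Optional[str]:
    # Assumption: only the Forgejo / Gitea URL layout is handled here.
    if provider not in ("forgejo", "gitea"):
        return None
    url = remote_url.strip()
    match = HTTP_RE.match(url)
    if match:
        scheme, host, path = match.groups()
        return f"{scheme}://{host}/{path}/commit/{sha}"
    match = SCP_RE.match(url)
    if match:
        host, path = match.groups()
        # Assumption: the web UI lives at https://<ssh host>/.
        return f"https://{host}/{path}/commit/{sha}"
    return None
```

Because the result is optional metadata, a `None` here should simply leave the commit without a deep link.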

    listCommitComments(sha) — Phase 2 (defined, not implemented)

    listCommitComments(sha):
        not implemented via git CLI
        delegated to repository-specific API adapters


    What this enables immediately

    • PR base selection
    • Safe PR creation from commit
    • Commit ↔ Issue linking via commit message
    • “Commits relevant to issue”
    • Cross-repo support (Forgejo + OneDev) with identical behavior

    Summary

    This contract is:

    • Minimal
    • Deterministic
    • Backend-only
    • UI-agnostic
    • Repository-provider-agnostic

    Once this contract is stable, Sztabina becomes a reliable Git intelligence engine rather than a UI-driven guesser.

Type: New Feature
Priority: Major
Assignee:
Version: 1.0
Sprints: n/a
Customer: n/a
Issue Votes: 0
Watchers: 3
Reference: SZ-55