Protection from AI bots and crawlers (SZ-73)
Artur Hefczyc opened 2 months ago

The main problem we face is server overload caused by AI bots and crawlers. The most sensible solution seems to be hiding resource-heavy operations from anonymous/guest access. I suggested making these operations accessible based on user permissions, which would give us the most flexibility.

  • rk@tigase.net changed state to 'In Progress' 3 weeks ago
  • rk@tigase.net commented 3 weeks ago

    Bot Attack Surface Area

    Screenshot 2026-03-10 at 6.03.11 PM.png

    Description

    1) Test Approach:

    • Choose tools to measure impact of Bots
    • Choose tools to induce Bot like stress
    • Form a baseline of resource usage under bot attack before applying bot guardrails
    • After each layer is added — verify the layer holds under the same load. We would watch the sztab-backend and sztabina pods specifically during bot stress tests.

    2) Identify tools to measure bot impact (CPU or I/O usage)

    • Grafana + Prometheus — we already have this or it's easy to add to the cluster via Helm.
    • Gives us CPU, memory, and network I/O per pod.

    3) Identify tools to induce Bot-like stress

    A) k6 — open source load testing tool

    We can write scripts in TypeScript and simulate concurrent anonymous/bot traffic against specific endpoints.

    Example:

    import http, { RefinedResponse, ResponseType } from 'k6/http';
    import { check } from 'k6';
    
    export default function (): void {
      const res: RefinedResponse<ResponseType> = http.get(
        'https://staging.sztab.com/api/projects/1/pulls/5/diff',
        {
          headers: { 'User-Agent': 'GPTBot/1.0' },
        }
      );
    
      check(res, {
        'status is 200': (r) => r.status === 200,
      });
    }
    

    B) Java with Gatling

    This is essentially the Java equivalent of k6. Since the broader Tigase team is Java-first, Gatling scripts would feel more natural to them and fit into Maven builds. Shall I use this option? That way the bot simulation scripts can be reused for other Tigase projects.

    Gatling offers both Java and Kotlin DSLs, so developers can stay in their preferred language.

    C) JMeter

    JMeter test plans (.jmx files) can serve a dual purpose:

    • Bot simulation
    • Stress test

    However, k6 is frictionless and will work "out of the box".


    4) Layered approach to Bot mitigation

    a) Layer 1: Spring Security — anonymous request blocking
    (lowest effort, highest impact)

    b) Layer 2: Caddy — rate limiting + bot filtering at the edge
    (before Spring even sees the request)

    c) Layer 3: robots.txt (soft signal, respected by well-behaved bots)

    d) Layer 4: Permission-based access (Artur's suggestion — most flexible)


    4.1 Layer 1

    The simplest way is to identify the most expensive APIs and mandate authentication for shortlisted APIs.

    With Spring this is easy: in the Spring Security policy add .authenticated() for such endpoints.

    APIs that trigger git clone and git merge are candidates.
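
    A minimal sketch of that change, assuming the Spring Security 6 SecurityFilterChain style (the matched paths are illustrative, loosely based on the diff/files/issues endpoints discussed in this thread; the rest of the chain is elided):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class BotProtectionConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http.authorizeHttpRequests(auth -> auth
            // Expensive endpoints: git diff, tree traversal, DSL search
            .requestMatchers(
                "/api/projects/*/pulls/*/diff",
                "/api/projects/*/files/**",
                "/api/projects/*/issues").authenticated()
            // Other routes keep whatever policy is already configured (elided here)
            .anyRequest().permitAll());
        return http.build();
    }
}
```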


    4.2 Layer 2

    Since Caddy is already our reverse proxy with forward_auth, we can add:

    # Rate limiting for anonymous traffic
    # (requires a Caddy build with the caddy-ratelimit plugin; sketch only)
    rate_limit {
        zone anonymous {
            match {
                not header Authorization *
                not header Cookie *
            }
            key    {remote_host}
            events 10
            window 1m
        }
    }

    # Block known bot user agents
    @bots header_regexp User-Agent `(?i)(GPTBot|ClaudeBot|CCBot|Bytespider|SemrushBot|AhrefsBot)`
    respond @bots 403
    

    This stops bots before they consume Spring Boot or Sztabina resources at all.
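
    For intuition, the per-client limit the proxy applies can be modeled with a small fixed-window counter in plain Java (a simplified sketch, not Caddy's actual algorithm):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified per-IP fixed-window limiter: allow `limit` events per `windowMillis`.
public class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Map<String, long[]> state = new ConcurrentHashMap<>(); // ip -> {windowStart, count}

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow(String ip, long nowMillis) {
        long[] s = state.computeIfAbsent(ip, k -> new long[] { nowMillis, 0 });
        if (nowMillis - s[0] >= windowMillis) { // window expired: reset
            s[0] = nowMillis;
            s[1] = 0;
        }
        if (s[1] < limit) {
            s[1]++;
            return true;   // request passes
        }
        return false;      // request would get a 429
    }

    public static void main(String[] args) {
        FixedWindowLimiter rl = new FixedWindowLimiter(30, 60_000); // 30 r/min per IP
        int allowed = 0;
        for (int i = 0; i < 100; i++) {
            if (rl.allow("203.0.113.7", 0)) allowed++; // 100 hits in the same instant
        }
        System.out.println(allowed); // 30: everything past the cap is rejected
    }
}
```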


    4.3 Layer 3 — robots.txt

    Serve a robots.txt from Caddy directly blocking AI crawlers:

    User-agent: GPTBot
    Disallow: /
    
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: CCBot
    Disallow: /
    
    User-agent: *
    Disallow: /api/
    Allow: /
    

    This is a soft signal, respected only by well-behaved crawlers.


    4.4 Layer 4 — Permission-based access

    This is the existing ExternalUserPolicy / role system extended with a new dimension.

    Instead of just authenticated vs anonymous, we gate by role.

    Example:

    @PreAuthorize("hasPermission(#projectId, 'Project', 'READ_DIFFS')")
    public DiffResponse getPullRequestDiff(...) { ... }
    

    Roles like GUEST / COMMUNITY could be explicitly excluded from diff/search endpoints even if authenticated.

    This is useful if we ever allow public read-only accounts but still want to protect expensive resources.

    4.5 Layer 5 (Host Layer) — Using Host IDS (such as OSSEC)

    OSSEC / Wazuh (OSSEC's modern fork) can help — it does log analysis, anomaly detection, and can trigger active responses (e.g. auto-ban an IP via iptables). But I think for now this may be an overkill in Sztab's context.

    Known limitations

    • Authenticated scenario uses a single shared session cookie across all 20 VUs. Real bot farms distribute load across multiple accounts/sessions. A more realistic simulation would create 5-10 bot accounts and distribute cookies among VUs — deferred to a later iteration.
  • rk@tigase.net commented 3 weeks ago
    rksuma@Ramakrishnans-MacBook-Pro sztab % git checkout -b feature/SZ-73-Protection-from-AI-bots-and-crawlers 
    Switched to a new branch 'feature/SZ-73-Protection-from-AI-bots-and-crawlers'
    rksuma@Ramakrishnans-MacBook-Pro sztab %
    
  • rk@tigase.net commented 3 weeks ago

    I had assumed that bots/crawlers only cause performance issues by exhausting resources.

    But bots can also attempt privilege escalation. Hence this issue is in part about security posture as well.

    Data harvesting is another risk: A crawler indexing all the issues, PRs, comments, and code — even if read-only, this is a confidentiality problem for private projects and can be used for competitor intelligence gathering.

    Please let me know if we should treat this as a performance issue alone in this rev.

  • rk@tigase.net commented 3 weeks ago

    Monitoring tool

    Phase 1 (immediate) — kubectl top for CPU/memory across the three pods during stress tests. Free, zero setup, good enough to establish baseline.

    Phase 2 (proper) — add node_exporter to the EC2 node for disk I/O, feed into Grafana alongside Caddy metrics. Full picture.

  • rk@tigase.net commented 3 weeks ago

    SZ-73 Bot Protection — Baseline Measurements

    Purpose

    Establish pre-mitigation resource usage baseline on staging, before any bot protection layers are applied. These numbers will be used to validate the effectiveness of each mitigation layer as it is implemented.

    Environment

    • Cluster: k3s on AWS EC2 (us-west-2)
    • Host: ec2-35-87-145-56.us-west-2.compute.amazonaws.com
    • Namespace: sztab-staging
    • Image tag: sz73-bot-protection (rebased on wolnosc, no SZ-73 changes applied yet)
    • Date: 2026-03-12

    Idle Baseline (no load)

    Captured via kubectl top pods -n sztab-staging with no active traffic.

    Pod             CPU (cores)   Memory
    sztab-backend   5m            369Mi
    sztab-db        4m            46Mi
    sztabina        1m            1Mi
    caddy           1m            10Mi
    sztab-ui        1m            2Mi

    Notes:

    • sztab-backend memory at 369Mi reflects normal Spring Boot JVM baseline (expected)
    • sztabina and caddy are effectively idle
    • sztab-db at 4m CPU reflects background PostgreSQL activity only

    Bot Stress Baseline (under simulated load)

    TODO: Run k6 stress test simulating anonymous bot traffic against expensive endpoints. Capture CPU and memory spike for sztab-backend, sztabina, and sztab-db.

    Target Endpoints

    Endpoint                                 Why Expensive
    GET /api/projects/{id}/pulls/{id}/diff   Triggers git diff via Sztabina
    GET /api/projects/{id}/issues?q=...      DSL query, DB-heavy
    GET /api/projects/{id}/files/{branch}    Git tree traversal via Sztabina

    k6 Test Parameters

    • Virtual users: TBD
    • Duration: TBD
    • User-Agent: GPTBot/1.0 (simulates AI crawler)
    • Auth: none (anonymous)

    Results

    TODO: Fill in after k6 run.

    Pod             CPU (cores)   Memory   Delta vs Idle
    sztab-backend   -             -        -
    sztab-db        -             -        -
    sztabina        -             -        -

    Post-Mitigation Measurements

    TODO: Re-run same k6 test after each layer is applied and record results here.

    Layer     Description                             Backend CPU   Sztabina CPU   Notes
    Layer 1   Spring Security .authenticated()        -             -              -
    Layer 2   Caddy rate limiting + bot UA blocking   -             -              -
    Layer 3   robots.txt                              -             -              soft signal only
    Layer 4   Permission-based access (role gating)   -             -              -
  • rk@tigase.net commented 3 weeks ago

    Next step: install k6 on my laptop:

    rksuma@Ramakrishnans-MacBook-Pro sztab %  brew install k6
    //...
    rksuma@Ramakrishnans-MacBook-Pro sztab %  k6 version
    k6 v1.6.1 (commit/devel, go1.26.0, darwin/arm64)
    rksuma@Ramakrishnans-MacBook-Pro sztab % 
    

    Now I'll write a TypeScript script targeting the three expensive endpoints with a GPTBot user agent, no auth, and enough virtual users to actually stress the backend.

  • rk@tigase.net referenced from other issue 2 weeks ago
  • rk@tigase.net commented 2 weeks ago

    Results of Layer 1 testing after locking down all expensive endpoints with .authenticated() (please disregard the spurious error at the end when deleting the test project).

    Essentially, since the bot does not authenticate itself, every hit returns HTTP 403, so the expensive operations never run; the backend only spends CPU rejecting requests through the filter chain (see the Step 7 metrics below).

    
    rksuma@Ramakrishnans-MacBook-Pro sztab % ADMIN_USER=admin ADMIN_PASSWORD=SztabStagingAdmin! ./scripts/stress-test/k6/run-stress-test.sh
    [INFO]  === SZ-73 Bot Stress Test ===
    [INFO]  Base URL:    http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com
    [INFO]  Namespace:   sztab-staging
    [INFO]  VUs:         50
    [INFO]  Duration:    60s
    [INFO]  --- Step 1: Login ---
    [INFO]  Login successful.
    [INFO]  Logged in as user id=1
    [INFO]  --- Step 2: Create Sztab project ---
    [INFO]  Project 'SZ73 Stress Test' already exists — looking up existing project...
    [INFO]  Found existing project: id=16
    [INFO]  --- Step 3: Create issue ---
    [INFO]  Issue created: id=3
    [INFO]  --- Step 4: Create pull request ---
    [INFO]  Pull request created: id=3
    [INFO]  --- Step 5: Baseline pod metrics (idle) ---
    NAME                            CPU(cores)   MEMORY(bytes)   
    caddy-847774bbf9-xzvnv          1m           12Mi            
    sztab-backend-644c77d58-r46xd   2m           432Mi           
    sztab-db-fb967c9d5-fs84w        2m           44Mi            
    sztab-ui-57764ffc4f-r9hlg       1m           3Mi             
    sztabina-65b5cff756-kzl4f       1m           3Mi             
    [INFO]  --- Step 6: Run k6 stress test ---
    [INFO]  Watch pod metrics in another terminal: kubectl top pods -n sztab-staging --watch
    
             /\      Grafana   /‾‾/  
        /\  /  \     |\  __   /  /   
       /  \/    \    | |/ /  /   ‾‾\ 
      /          \   |   (  |  (‾)  |
     / __________ \  |_|\_\  \_____/ 
    
    
         execution: local
            script: /Users/rksuma/tigase/sztab/scripts/stress-test/k6/bot-stress-test.ts
            output: -
    
         scenarios: (100.00%) 1 scenario, 50 max VUs, 1m30s max duration (incl. graceful stop):
                  * default: 50 looping VUs for 1m0s (gracefulStop: 30s)
    
    
    
      █ THRESHOLDS 
    
        http_req_duration
        ✓ 'p(95)<5000' p(95)=134.53ms
    
    
      █ TOTAL RESULTS 
    
        checks_total.......: 69856  1161.339279/s
        checks_succeeded...: 25.00% 17464 out of 69856
        checks_failed......: 75.00% 52392 out of 69856
    
        ✗ status is 200 (unprotected)
          ↳  0% — ✓ 0 / ✗ 17464
        ✗ status is 401 (auth required)
          ↳  0% — ✓ 0 / ✗ 17464
        ✓ status is 403 (bot blocked)
        ✗ status is 429 (rate limited)
          ↳  0% — ✓ 0 / ✗ 17464
    
        HTTP
        http_req_duration....: avg=71.19ms  min=29.65ms  med=55.02ms  max=422.05ms p(90)=124.52ms p(95)=134.53ms
        http_req_failed......: 100.00% 17464 out of 17464
        http_reqs............: 17464   290.33482/s
    
        EXECUTION
        iteration_duration...: avg=172.15ms min=130.26ms med=155.68ms max=522.51ms p(90)=225.17ms p(95)=235.3ms 
        iterations...........: 17464   290.33482/s
        vus..................: 50      min=50             max=50
        vus_max..............: 50      min=50             max=50
    
        NETWORK
        data_received........: 7.8 MB  129 kB/s
        data_sent............: 2.3 MB  38 kB/s
    
    
    
    
    running (1m00.2s), 00/50 VUs, 17464 complete and 0 interrupted iterations
    default ✓ [======================================] 50 VUs  1m0s
    [INFO]  --- Step 7: Pod metrics (post-stress) ---
    NAME                            CPU(cores)   MEMORY(bytes)   
    caddy-847774bbf9-xzvnv          99m          20Mi            
    sztab-backend-644c77d58-r46xd   252m         440Mi           
    sztab-db-fb967c9d5-fs84w        2m           45Mi            
    sztab-ui-57764ffc4f-r9hlg       1m           3Mi             
    sztabina-65b5cff756-kzl4f       1m           4Mi             
    [INFO]  === Stress test complete. Teardown will run now. ===
    [INFO]  --- Teardown ---
    [INFO]  Deleting Sztab project 16...
    [ERROR] Failed to delete project 16
    [INFO]  Teardown complete.
    rksuma@Ramakrishnans-MacBook-Pro sztab %
    
  • rk@tigase.net commented 2 weeks ago

    Baseline stress test results (pre-protection, 2026-03-14)

    Ran k6 stress test against staging (ec2-35-87-145-56.us-west-2.compute.amazonaws.com) with 50 VUs for 60s — 30 unauthenticated (anonymous bot simulation) and 20 authenticated (bot with DEVELOPER role, hitting issues/PR/branch endpoints).

    Throughput: 279 req/s

    Pod metrics (idle → under load)

    Pod             CPU idle   CPU load   Memory idle   Memory load
    sztab-backend   2m         370m       443Mi         544Mi
    sztab-db        4m         137m       46Mi          77Mi
    caddy           1m         117m       23Mi          23Mi
    sztabina        1m         1m         2Mi           2Mi

    Observations

    • Unauthenticated requests: 100% returning 403 -- Layer 1 (Spring Security) blocking all anonymous traffic correctly.
    • Authenticated requests: 100% returning 200 -- DEVELOPER role has correct read access.
    • Backend CPU peaks at 370m under load -- this is the baseline to beat after Caddy rate limiting is applied.
    • DB CPU peaks at 137m -- issue/PR list queries are the likely driver.
    • Sztabina unaffected -- git ops not triggered by read-only REST traffic.

    Known limitations

    • Authenticated scenario uses a single shared session cookie across all 20 VUs. Real bot farms distribute load across multiple accounts/sessions. A more realistic simulation would create 5-10 bot accounts and distribute cookies among VUs -- deferred to a later iteration.

    Next steps

    Implement Layer 2 (Caddy rate limiting) and re-run to measure impact.

  • rk@tigase.net commented 2 weeks ago

    Layer 2: Caddy-level rate limiting and bot blocking

    Rejection is now pushed upstream to the reverse proxy, before requests ever reach the JVM. I added two defenses to the Caddyfile:

    • UA blocklist -- known AI crawlers that identify themselves honestly (GPTBot, ClaudeBot, CCBot, Bytespider, SemrushBot, AhrefsBot) are rejected with 403 at the proxy edge. Note that this check is easily sidestepped: adversarial scrapers that spoof their user agent will bypass it, which is why rate limiting is the primary defense.

    • Anonymous rate limiting -- unauthenticated traffic is capped at 30 requests/min per IP. Authenticated users (identified by session cookie or API token) are exempt. At 30 r/m, a human browsing casually has ample headroom; a bot hammering endpoints hits the ceiling immediately.

    To support this, I built a custom Caddy image with the rate limiting plugin baked in, pinned to v2.8.4 for reproducibility. The next stress test run will measure how much backend CPU drops as a result.

  • rk@tigase.net commented 2 weeks ago

    Layer 2 stress test results (Caddy rate limiting, 2026-03-14)

    Setup

    Same test as baseline: 50 VUs for 60s, 30 unauthenticated and 20 authenticated (DEVELOPER role). Rate limiting applied to anonymous traffic only (30 r/min per IP).

    Pod metrics (idle => under load)

    Pod             CPU idle   CPU load   Memory idle   Memory load
    sztab-backend   2m         174m       443Mi         542Mi
    sztab-db        4m         147m       46Mi          77Mi
    caddy           1m         102m       12Mi          17Mi
    sztabina        1m         1m         2Mi           2Mi

    Comparison vs baseline (Layer 1 only)

    Pod             Layer 1   Layer 2   Change
    sztab-backend   370m      174m      -53%
    sztab-db        137m      147m      ~flat (noise)
    caddy           117m      102m      -13%

    Observations

    • Backend CPU dropped by 53% -- anonymous bot traffic is now absorbed by Caddy before requests reach the JVM. The JVM no longer wakes up, allocates objects, or runs the filter chain for unauthenticated requests that exceed the rate limit.
    • DB CPU is flat -- authenticated queries still run as expected. The reduction in backend CPU is entirely from eliminating the unauthenticated filter chain overhead.
    • Caddy CPU is slightly lower too -- the rate limit decision short-circuits before the upstream proxy step, so Caddy does less work per rejected request than it did forwarding 403s from the backend.
    • Memory is stable across both scenarios -- no sign of heap pressure or GC storms under load.

    Next steps

    Layer 3 (robots.txt) and Layer 4 (permission-based access gating) to follow.

  • rk@tigase.net commented 2 weeks ago

    Layer 3 Bot mitigation using robots.txt.

    Test Results:

    rksuma@Ramakrishnans-MacBook-Pro sztab % 
    rksuma@Ramakrishnans-MacBook-Pro sztab % helm upgrade sztab deploy/helm/sztab -f deploy/helm/sztab/values-staging.yaml -n sztab-staging
    Release "sztab" has been upgraded. Happy Helming!
    NAME: sztab
    LAST DEPLOYED: Sun Mar 15 10:39:36 2026
    NAMESPACE: sztab-staging
    STATUS: deployed
    REVISION: 25
    TEST SUITE: None
    rksuma@Ramakrishnans-MacBook-Pro sztab % kubectl rollout restart deployment/caddy -n sztab-staging
    deployment.apps/caddy restarted
    Waiting for deployment "caddy" rollout to finish: 1 old replicas are pending termination...
    rksuma@Ramakrishnans-MacBook-Pro sztab %  kubectl rollout status deployment/caddy -n sztab-staging
    Waiting for deployment "caddy" rollout to finish: 1 old replicas are pending termination...
    deployment "caddy" successfully rolled out
    rksuma@Ramakrishnans-MacBook-Pro sztab % kubectl get pods -n sztab-staging -w                                                          
    NAME                            READY   STATUS    RESTARTS   AGE
    caddy-6fbc5697cd-ll92p          1/1     Running   0          13s
    sztab-backend-644c77d58-r46xd   1/1     Running   0          41h
    sztab-db-fb967c9d5-fs84w        1/1     Running   0          18d
    sztab-ui-57764ffc4f-r9hlg       1/1     Running   0          3d12h
    sztabina-65b5cff756-kzl4f       1/1     Running   0          42h
    ^C%                     
    
    
    Verify Caddy serves the robots.txt and the sitemap.xml:

    rksuma@Ramakrishnans-MacBook-Pro sztab % curl -s http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/robots.txt
    curl -s http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/sitemap.xml
    User-agent: *
    Disallow: /api/
    Disallow: /git/
    
    User-agent: GPTBot
    Disallow: /
    
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: CCBot
    Disallow: /
    
    User-agent: Bytespider
    Disallow: /
    
    User-agent: Amazonbot
    Disallow: /
    
    User-agent: PetalBot
    Disallow: /
    
    Sitemap: https://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/sitemap.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/</loc>
        <lastmod>2026-03-14</lastmod>
      </url>
      <url>
        <loc>https://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/docs</loc>
        <lastmod>2026-03-14</lastmod>
      </url>
    </urlset>%
    rksuma@Ramakrishnans-MacBook-Pro sztab % curl -s http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/sitemap.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/</loc>
        <lastmod>2026-03-14</lastmod>
      </url>
      <url>
        <loc>https://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/docs</loc>
        <lastmod>2026-03-14</lastmod>
      </url>
    </urlset>%
    rksuma@Ramakrishnans-MacBook-Pro sztab %
    
    
  • rk@tigase.net commented 2 weeks ago

    Layer 4: Permission-based access gating — design rationale

    Rate limiting (Layer 2) stops anonymous bots. A determined attacker creates an account and bypasses it. Blocking by INTERNAL/EXTERNAL user type is also insufficient — attacks can come from compromised or low-privilege internal accounts.

    Layer 4 gates expensive endpoints (PR detail, diffs, branch list) by role:

    Tier         Roles                                                                Access
    Light read   OBSERVER, CUSTOMER_SUPPORT                                           Issues list, basic project info
    Full read    DEVELOPER, QA_ENGINEER, DOCUMENT_WRITER, UX_DESIGNER, SCRUM_MASTER   + PR detail, branch list, diffs
    Write        PROJECT_MANAGER, RELEASE_MANAGER                                     + create/update
    Admin        ADMIN                                                                Everything

    Implementation: extend ExternalUserPolicy with requireRole(auth, RoleName...) and apply it at the controller layer. Boundaries are a starting point — raise concerns on this ticket if adjustments are needed.
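
    A rough sketch of what requireRole(auth, RoleName...) could reduce to, using a plain stand-in for Spring's Authentication (the stub and exception types here are illustrative, not the actual ExternalUserPolicy code):

```java
import java.util.Set;

public class RequireRoleSketch {
    enum RoleName { OBSERVER, DEVELOPER, PROJECT_MANAGER, ADMIN }

    // Stand-in for Spring's Authentication: just the caller's resolved roles.
    static class AuthStub {
        final Set<RoleName> roles;
        AuthStub(Set<RoleName> roles) { this.roles = roles; }
    }

    static class AccessDenied extends RuntimeException {
        AccessDenied(String m) { super(m); }
    }

    // Passes if the caller holds at least one of the required roles; throws otherwise.
    static void requireRole(AuthStub auth, RoleName... required) {
        for (RoleName r : required) {
            if (auth.roles.contains(r)) return;
        }
        throw new AccessDenied("caller lacks required role");
    }

    public static void main(String[] args) {
        AuthStub dev = new AuthStub(Set.of(RoleName.DEVELOPER));
        requireRole(dev, RoleName.DEVELOPER, RoleName.ADMIN); // full-read role: passes

        try {
            AuthStub observer = new AuthStub(Set.of(RoleName.OBSERVER));
            requireRole(observer, RoleName.DEVELOPER, RoleName.ADMIN);
        } catch (AccessDenied e) {
            System.out.println("denied"); // light-read role blocked from a full-read endpoint
        }
    }
}
```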

  • rk@tigase.net commented 2 weeks ago

    SZ-73 – Layer 4 Bot Mitigation Performance Experiment

    Objective

    Evaluate the performance impact of removing per-request database lookups for user and role resolution in the authorization policy layer.

    Previously, the security policy resolved the user type by querying the database via UserService.getUserByUsername() for each incoming request. Under bot load, this caused the backend to issue frequent database lookups despite the fact that the authenticated user's authorities are already present in the Spring Security Authentication object.

    The change introduced in this experiment eliminates the database lookup from the request hot path and instead relies solely on authorities stored in the SecurityContext.

    The goal is to verify the impact of this change under synthetic bot traffic.


    Change Introduced

    Previous behavior:

    request
      |
      v
    ExternalUserPolicy.resolveType()
      |
      v
    UserService.getUserByUsername()
      |
      v
    database lookup
    

    New behavior:

    request
      |
      v
    SecurityContextHolder
      |
      v
    Authentication.getAuthorities()
      |
      v
    policy enforcement (no DB access)
    

    The authorization aspects (RequireRoleAspect, RequireInternalAspect) now operate purely on the Authentication authorities.
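
    The new hot path amounts to an in-memory membership test over the authorities; a minimal model in plain Java (stand-in types, not the actual aspect code):

```java
import java.util.Set;

public class InMemoryAuthzCheck {
    // Stand-in for Authentication.getAuthorities(): role strings already held in the session.
    record AuthContext(Set<String> authorities) {}

    // Policy enforcement with no database access: a pure in-memory membership test.
    static boolean hasAnyRole(AuthContext ctx, String... roles) {
        for (String role : roles) {
            if (ctx.authorities().contains("ROLE_" + role)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AuthContext ctx = new AuthContext(Set.of("ROLE_DEVELOPER"));
        System.out.println(hasAnyRole(ctx, "DEVELOPER", "ADMIN")); // true, no DB round trip
        System.out.println(hasAnyRole(ctx, "ADMIN"));              // false
    }
}
```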

    Image tested:

    tigase.dev/sztab/sztab-backend:sz73-bot-protection-v8
    

    Test Environment

    Cluster: k3s staging cluster
    namespace: sztab-staging

    Deployment topology:

    caddy
    sztab-backend (1 replica)
    sztab-db
    sztab-ui
    sztabina
    

    Load Generation

    Load was generated using the project k6 stress test script:

    scripts/stress-test/k6/run-stress-test.sh
    

    Configuration:

    50 virtual users  
    20 authenticated  
    30 unauthenticated  
    duration: 60 seconds
    

    Traffic profile targets endpoints typically accessed by bot crawlers.


    k6 Results

    http_req_duration:

    avg = 71.7 ms  
    min = 27.68 ms  
    median = 53.23 ms  
    max = 655.92 ms  
    p90 = 126.02 ms  
    p95 = 142 ms  
    

    http_req_failed:

    63.73% (11093 / 17405)
    

    http_reqs:

    17405 requests  
    ≈ 289 requests/sec
    

    iteration_duration:

    avg = 172.51 ms  
    p95 = 243.16 ms
    

    Failure rate is expected because bot mitigation intentionally rejects a large fraction of unauthenticated traffic.


    Pod Metrics (Post-Stress)

    NAME            CPU       MEMORY
    --------------------------------
    caddy           105m      19Mi
    sztab-backend   503m      452Mi
    sztab-db        126m      74Mi
    sztab-ui        1m        3Mi
    sztabina        1m        2Mi
    

    Comparison with Previous Run

    Previous experiment (before removing DB lookups):

    sztab-backend CPU ≈ 808m
    sztab-db CPU ≈ 165m

    After optimization:

    sztab-backend CPU ≈ 503m
    sztab-db CPU ≈ 126m

    Observed improvements:

    backend CPU reduction ≈ 37–40%
    database CPU reduction ≈ 24%

    Latency also improved:

    previous avg latency ≈ 175 ms
    new avg latency ≈ 71 ms

    Note: the latency comparison should be considered indicative rather than strictly controlled. The earlier measurement and this run were conducted under slightly different runtime conditions, and the earlier measurement included the overhead of the per-request database lookup. While the improvement is directionally consistent with the removal of that lookup, the latency values should not be interpreted as a controlled A/B benchmark.
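
    The quoted reductions follow directly from the raw numbers:

```java
public class CpuDelta {
    // Percentage drop from `before` to `after`, rounded to one decimal place.
    static double pctDrop(double before, double after) {
        return Math.round((before - after) / before * 1000.0) / 10.0;
    }

    public static void main(String[] args) {
        System.out.println(pctDrop(808, 503)); // backend CPU reduction: 37.7%
        System.out.println(pctDrop(165, 126)); // db CPU reduction:      23.6%
        System.out.println(pctDrop(175, 71));  // avg latency reduction: 59.4%
    }
}
```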


    Interpretation

    The previous design performed a database lookup on each request to determine user type. Under bot traffic (~290 req/s), this resulted in frequent database access for information that was already present in the authenticated security context.

    By removing the database dependency from the hot path and relying on Authentication.getAuthorities() instead, the system now performs role checks purely in memory.

    This change produced measurable improvements in:

    • backend CPU utilization
    • database load
    • request latency

    Importantly, bot mitigation behavior remained unchanged.


    Conclusion

    Removing per-request user lookups from the authorization policy significantly improved system efficiency under bot traffic.

    At approximately 290 requests/sec:

    backend CPU dropped from ~808m to ~503m
    database CPU dropped from ~165m to ~126m
    average request latency dropped from ~175ms to ~71ms

    This confirms that the security policy layer should operate exclusively on data already present in the SecurityContext and avoid database access in the request hot path.

    This optimization improves the resilience of the system when subjected to high volumes of bot or crawler traffic.

  • rk@tigase.net commented 2 weeks ago

    Next: Git-level bot mitigation

    The REST API and proxy layers are now protected. The remaining attack surface is the Git endpoint (/git/*), which is proxied directly to Sztabina.

    A determined bot that obtains a valid PAT (or leverages anonymous access on a public project) can repeatedly issue clone, fetch, and diff operations. These are significantly more expensive than REST requests, as they trigger disk I/O and traversal of the git object graph.

    This makes the Git surface a high-cost amplification vector compared to the API layer.

    Planned mitigations:

    • Rate limiting on /git/* at the Caddy layer, independent from REST limits.
      Git operations are more expensive, so thresholds should be lower (e.g. 5–10 requests/min per IP).

    • PAT-scoped rate limiting to track usage per token rather than per IP.
      This helps mitigate bots that rotate source IPs while reusing credentials.

    • Sztabina-level request budgeting to enforce limits within the service itself, ensuring protection even if edge-layer controls are bypassed or misconfigured.
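
    The request-budgeting idea can be modeled as a simple concurrency cap (plain Java as a model of the mechanism; Sztabina itself is a Go service, and this API is hypothetical):

```java
import java.util.concurrent.Semaphore;

// Concurrency budget for expensive git operations: at most `maxConcurrent` run at once;
// excess requests are rejected immediately instead of queuing behind slow clones.
public class GitOpBudget {
    private final Semaphore permits;

    public GitOpBudget(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    public boolean tryRun(Runnable gitOp) {
        if (!permits.tryAcquire()) return false; // over budget: caller should answer 429/503
        try {
            gitOp.run();
            return true;
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        GitOpBudget budget = new GitOpBudget(2);
        budget.permits.tryAcquire(2); // simulate two clones already in flight
        System.out.println(budget.tryRun(() -> {})); // false: budget exhausted
        budget.permits.release(2);    // in-flight clones finish
        System.out.println(budget.tryRun(() -> {})); // true: capacity available again
    }
}
```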

    Before implementing these controls, SZ-78 will establish a baseline by stress testing git diff and related endpoints. This will measure CPU and I/O impact on Sztabina under load, using the same methodology previously applied to the REST API layer.

  • rk@tigase.net commented 2 weeks ago
    rksuma@Ramakrishnans-MacBook-Pro sztab % git checkout wolnosc
    
    Already on 'wolnosc'
    Your branch is up to date with 'origin/wolnosc'.
    rksuma@Ramakrishnans-MacBook-Pro sztab % git pull origin wolnosc
    
    From https://tigase.dev/sztab
     * branch            wolnosc    -> FETCH_HEAD
    Already up to date.
    rksuma@Ramakrishnans-MacBook-Pro sztab % git checkout -b feature/SZ-73-git-bot-mitigation
    Switched to a new branch 'feature/SZ-73-git-bot-mitigation'
    rksuma@Ramakrishnans-MacBook-Pro sztab % 
    
    
  • rk@tigase.net commented 2 weeks ago

    SZ-73 Work Log

    Summary

    Implemented a four-layer bot mitigation strategy across the HTTP stack (Spring Security, Caddy edge controls, crawler directives, and AOP-based authorization). Established load-testing infrastructure (k6) and baseline measurements to quantify impact. Identified Git endpoints as the remaining high-cost attack surface, to be addressed next.

    Total effort: ~27h


    SZ-77 (blocker, fixed first)

    Bug: ProjectService infers repo type from gitUrl presence instead of repoType field

    • Identified root cause: isExternalRepo = gitUrl != null ignored repoType field entirely
    • Added RepoType parameter to ProjectService.createProject() interface and impl
    • Updated ProjectController to pass dto.effectiveRepoType()
    • Removed deprecated createProject(Project) overload
    • Updated tests — 297 passing
    • Branch: bugfix/SZ-77-repoType-inference → merged to wolnosc
    • Estimate: 2h

    SZ-73 Layer 1: Spring Security audit

    • Confirmed .anyRequest().authenticated() already in place
    • Identified actuator and Swagger exposure as follow-up items
    • Estimate: 0.5h

    SZ-73 Layer 2: Caddy rate limiting and UA blocklist

    Custom Caddy image

    • Wrote deploy/helm/sztab/caddy/Dockerfile with xcaddy + caddy-ratelimit plugin
    • Pinned to caddy:2.8.4, added build-time module verification
    • Built multi-platform image (linux/amd64, linux/arm64)
    • Updated values.yaml, values-staging.yaml, Helm template for new image
    • Fixed imagePullPolicy: Always in Helm template to avoid stale image cache

    Caddyfile

    • Added @ai_bots UA blocklist (GPTBot, ClaudeBot, CCBot, Bytespider, SemrushBot, AhrefsBot, Amazonbot, PetalBot)
    • Added @anonymous rate limit zone: 30 r/min per {remote_ip}
    • Used header_regexp for JSESSIONID matching (handles multiple cookies correctly)
    • Moved Caddyfile to deploy/helm/sztab/caddy/Caddyfile
    • Updated Helm ConfigMap template path
    • Updated build-release.sh Caddyfile source path

    Staging deployment issues resolved

    • k3s image cache — added imagePullPolicy: Always
    • ConfigMap empty — fixed .Files.Get path in Helm template
    • Multiple Caddy CrashLoopBackOff cycles debugged

    Estimate: 6h


    SZ-73 Layer 3: robots.txt and sitemap.xml

    • Added handle /robots.txt directly in Caddyfile with per-agent rules
    • Added handle /sitemap.xml with {env.SZTAB_DOMAIN} placeholder
    • Used SZTAB_DOMAIN env var (already wired from sztab.domain in Helm values)
    • Verified both endpoints return correct domain substitution on staging
    • Updated docker-compose.yml with Caddyfile path and custom image
    • Estimate: 1.5h

    Load testing infrastructure

    k6 script (bot-stress-test.ts)

    • Two scenarios: unauthenticated_bots (30 VUs), authenticated_bots (20 VUs)
    • Unauthenticated: hits public project/issue/PR list endpoints
    • Authenticated: hits issues, PR detail, branch list with bot session cookie
    • Named scenario exports with exec field
    • Fixed endpoint bugs: api/projects/{id}/issues → api/issues?projectName=, branches path

    Runner script (run-stress-test.sh)

    • Mac-compatible curl_api helper (no head -n -1)
    • Admin login → fetch user ID → create bot user → assign DEVELOPER role → bot login
    • Project creation (repoType: LOCAL) → issue → PR
    • kubectl top before and after k6 run
    • Teardown: delete all PRs by project → delete feature branch → delete project → delete bot user
    • Multiple teardown fixes: PR FK constraint, branch FK constraint, bulk PR deletion

    Debugging cycles

    • Mac shell incompatibilities (head -n -1, local -n nameref)
    • Sztabina 409 handling in SztabinaClient.createRepository()
    • Sztabina returning text/plain → fixed in Go handler with util.EncodeJSON
    • RepositoryResponse NPE on null return from 409 handler
    • k6 named scenario exec field missing
    • Wrong issues endpoint, branch endpoint literal not interpolated

    Estimate: 8h


    Baseline measurements

    | Layer | Backend CPU | DB CPU |
    |---|---|---|
    | Idle | 2m | 4m |
    | Layer 1 only | 370m | 137m |
    | Layer 2 (Caddy RL) | 174m | 137m |
    | Layer 4 v1 (DB lookup) | 808m | 165m |
    | Layer 4 v2 (auth cache) | 503m | 126m |

    Estimate: 1.5h


    SZ-73 Layer 4: Permission-based access gating

    Design

    • Defined two access tiers: LIGHT_READ (all roles) and FULL_READ (DEVELOPER+)
    • Decided against UserType as primary boundary — insider threat applies equally

    Implementation

    • AccessTier enum in com.sztab.policy.security.enums
    • @RequireRole(AccessTier) annotation in com.sztab.annotations.security
    • @RequireInternal annotation in com.sztab.annotations.security
    • RequireRoleAspect and RequireInternalAspect in com.sztab.policy.security.aspect
    • Pre-allocated RoleName[] arrays in aspect (no per-request allocation)
    • Defensive auth null check in both aspects
    • Applied annotations across IssueController, BranchController, PullRequestController
    • Fixed User.hasRole() bug: enum vs String comparison always returned false
    • Added UserTest with guard comment explaining the trap
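    The hasRole() trap is easy to reproduce in isolation. A minimal sketch (names are hypothetical stand-ins, not the actual sztab classes):

    ```java
    import java.util.Set;

    public class HasRoleTrap {
        // Hypothetical stand-in for the real RoleName enum
        enum RoleName { GUEST, DEVELOPER, ADMIN }

        static final Set<RoleName> roles = Set.of(RoleName.DEVELOPER);

        // Buggy variant: an enum instance never equals a String,
        // so this compiles cleanly but always returns false.
        static boolean hasRoleBuggy(String role) {
            return roles.stream().anyMatch(r -> r.equals(role));
        }

        // Fixed variant: compare enum to enum.
        static boolean hasRoleFixed(RoleName role) {
            return roles.contains(role);
        }

        public static void main(String[] args) {
            System.out.println(hasRoleBuggy("DEVELOPER"));        // false — the trap
            System.out.println(hasRoleFixed(RoleName.DEVELOPER)); // true
        }
    }
    ```

    The bug is silent because `equals(Object)` accepts any type; the guard comment in UserTest exists precisely because the compiler gives no warning here.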

    Performance optimization

    • Initial implementation caused extra getUserByUsername() DB call per request → 808m CPU
    • Fixed: resolved roles from Authentication.getAuthorities() — no DB access in hot path
    • Updated CustomUserDetailsService to include USERTYPE_INTERNAL/EXTERNAL as authority
    • Updated ExternalUserPolicy to use authorities — removed UserService dependency
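    The hot-path change can be sketched without Spring on the classpath (the authority strings and tier mapping below are simplified assumptions about the real GrantedAuthority setup):

    ```java
    import java.util.List;
    import java.util.Set;

    public class AuthoritySketch {
        // Hypothetical authority strings mirroring what CustomUserDetailsService
        // places on the Authentication at login time.
        static final Set<String> FULL_READ_ROLES = Set.of("ROLE_DEVELOPER", "ROLE_ADMIN");

        // Resolves the access tier from the already-loaded authorities —
        // no getUserByUsername() DB call in the per-request hot path.
        static boolean hasFullRead(List<String> authorities) {
            return authorities.stream().anyMatch(FULL_READ_ROLES::contains);
        }

        public static void main(String[] args) {
            System.out.println(hasFullRead(List.of("ROLE_DEVELOPER", "USERTYPE_INTERNAL"))); // true
            System.out.println(hasFullRead(List.of("ROLE_GUEST", "USERTYPE_EXTERNAL")));     // false
        }
    }
    ```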

    Estimate: 6h


    Documentation

    • Ticket comments: baseline results, Layer 2 results, Layer 4 results, design rationale
    • Team update sent to Artur
    • Work log

    Estimate: 1.5h


    Total

    | Area | Hours |
    |---|---|
    | SZ-77 blocker fix | 2h |
    | Layer 1 audit | 0.5h |
    | Layer 2 (Caddy image + Caddyfile) | 6h |
    | Layer 3 (robots.txt + sitemap.xml) | 1.5h |
    | Load testing infrastructure | 8h |
    | Baseline measurements + analysis | 1.5h |
    | Layer 4 (AOP role gating + perf optimization) | 6h |
    | Documentation | 1.5h |
    | Total | 27h |
  • rk@tigase.net commented 2 weeks ago

    A new problem surfaced when subjecting Sztab to large repos.

    Sztab and Sztabina are interfaced using Spring WebFlux.

     { Sztabina }  ==> { Sztab }
    

    When computing a diff, Sztabina pipes it to Sztab through a WebFlux buffer. The default buffer size is 256KB, which was insufficient for the test I ran (I created a large repo).

    This caused the default WebFlux buffer to overflow. When this happens, Sztabina is unaware of the issue and shows no error in its logs; it's the consumer (Sztab) that fails, and even there the exception is raised deep inside Spring WebFlux, leading to a cryptic stack trace in the logs:

    ```
    rksuma@Ramakrishnans-MacBook-Pro sztab % kubectl logs -n sztab-staging deployment/sztab-backend --tail=200 | grep -B2 -A10 "diff-by-url" | head -40
    Defaulted container "sztab-backend" out of: sztab-backend, wait-for-db (init)
            Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
    Error has been observed at the following site(s):
            *__checkpoint ⇢ Body from POST http://sztabina:8085/repos/compare/diff-by-url [DefaultClientResponse]
    Original Stack Trace:
                    at org.springframework.core.io.buffer.LimitedDataBufferList.raiseLimitException(LimitedDataBufferList.java:99) ~[spring-core-6.1.13.jar!/:6.1.13]
                    at org.springframework.core.io.buffer.LimitedDataBufferList.updateCount(LimitedDataBufferList.java:92) ~[spring-core-6.1.13.jar!/:6.1.13]
                    at org.springframework.core.io.buffer.LimitedDataBufferList.add(LimitedDataBufferList.java:58) ~[spring-core-6.1.13.jar!/:6.1.13]
                    at reactor.core.publisher.MonoCollect$CollectSubscriber.onNext(MonoCollect.java:103) ~[reactor-core-3.6.10.jar!/:3.6.10]
                    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122) ~[reactor-core-3.6.10.jar!/:3.6.10]
                    at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:200) ~[reactor-core-3.6.10.jar!/:3.6.10]
                    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122) ~[reactor-core-3.6.10.jar!/:3.6.10]
                    at reactor.netty.channel.FluxReceive.onInboundNext(FluxReceive.java:379) ~[reactor-netty-core-1.1.22.jar!/:1.1.22]
                    at reactor.netty.channel.ChannelOperations.onInboundNext(ChannelOperations.java:425) ~[reactor-netty-core-1.1.22.jar!/:1.1.22]
    ```

    Fix:

    Increased the WebFlux buffer size to 16MB to support large diffs.

    We can't fully know in advance because diff size depends on user content. The right approach is a generous configurable default, a specific caught exception with a clear error message, and monitoring.
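    For reference, raising the in-memory limit on the WebClient side looks roughly like this (a sketch — the class and method names below are hypothetical, and in the real codebase the 16MB value would be injected from a configurable property rather than hard-coded):

    ```java
    import org.springframework.web.reactive.function.client.ExchangeStrategies;
    import org.springframework.web.reactive.function.client.WebClient;

    public class SztabinaClientConfig {
        // Hypothetical factory method; in practice this would be a @Bean.
        static WebClient sztabinaWebClient() {
            ExchangeStrategies strategies = ExchangeStrategies.builder()
                // Default is 256KB; raise to 16MB so large diffs fit in the
                // codec buffer instead of raising DataBufferLimitException.
                .codecs(c -> c.defaultCodecs().maxInMemorySize(16 * 1024 * 1024))
                .build();
            return WebClient.builder()
                .baseUrl("http://sztabina:8085")
                .exchangeStrategies(strategies)
                .build();
        }
    }
    ```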

  • Artur Hefczyc commented 2 weeks ago

    Good catch. Yes, this is exactly why I am in favor of running tests against real world data. Testing on our largest repos is a good start but maybe this is not enough. How about testing the system against really big repos available out there. Like Linux repo?

    I mean, how do you know that the cache size 16MB is enough? Is there a limit to what we can handle?

  • rk@tigase.net referenced from other issue 2 weeks ago
  • rk@tigase.net commented 2 weeks ago

    Good catch. Yes, this is exactly why I am in favor of running tests against real world data. Testing on our largest repos is a good start but maybe this is not enough. How about testing the system against really big repos available out there. Like Linux repo?

    I mean, how do you know that the cache size 16MB is enough? Is there a limit to what we can handle?

    16MB covers the overwhelming majority of real-world engineering team PRs based on first-principles sizing (500 files × 200 lines × 2 sides × 100 bytes ≈ 20MB worst case). It's a pragmatic default for the target audience.

    But for Linux-scale repos the answer is not a bigger buffer — it's streaming.

    bodyToMono() forces full in-memory buffering by design. The correct fix at that scale is to stream the diff response directly from Sztabina to the client using bodyToFlux() or server-sent events, bypassing the buffer entirely. That's a more significant architectural change and I am tracking that as a future improvement.
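    The buffering-vs-streaming distinction can be illustrated in plain Java (a conceptual sketch, not the WebFlux API): buffering must hold the whole payload before handing it over, while streaming forwards fixed-size chunks so peak memory stays at one chunk regardless of diff size.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.io.UncheckedIOException;

    public class StreamVsBuffer {
        // Buffered path (bodyToMono-style): the whole payload is materialized,
        // so memory grows with diff size and a hard limit must exist somewhere.
        static byte[] buffered(InputStream in) {
            try {
                return in.readAllBytes();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Streaming path (bodyToFlux-style): forward 8KB chunks as they arrive;
        // peak memory is one chunk no matter how large the diff is.
        static long streamed(InputStream in, OutputStream out) {
            try {
                byte[] chunk = new byte[8 * 1024];
                long total = 0;
                int n;
                while ((n = in.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                    total += n;
                }
                return total;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public static void main(String[] args) {
            byte[] payload = new byte[1_000_000]; // stand-in for a large diff
            System.out.println(buffered(new ByteArrayInputStream(payload)).length);
            System.out.println(streamed(new ByteArrayInputStream(payload), new ByteArrayOutputStream()));
        }
    }
    ```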

    Testing against real large repos like Linux is a good idea for stress testing the git engine (SZ-78), but it will surface the streaming limitation as much as the buffer size. I have opened https://tigase.dev/sztab/~issues/120 to track this.

  • rk@tigase.net referenced from other issue 2 weeks ago
  • rk@tigase.net commented 2 weeks ago

    SZ-78 Baseline: Sztabina diff endpoint under bot load (2026-03-17)

    Setup

    50 VUs for 60s — 30 unauthenticated (anonymous bot simulation) and 20 authenticated (DEVELOPER role). Authenticated scenario includes GET /api/pullrequests/29/detail which triggers a real git diff computation in Sztabina against the TESTSZTAB repo (20 files, ~276KB unified diff).

    Pod metrics (idle → under load)

    | Pod | CPU idle | CPU load | Memory idle | Memory load |
    |---|---|---|---|---|
    | sztab-backend | 2m | 294m | 443Mi | 462Mi |
    | sztab-db | 4m | 36m | 46Mi | 75Mi |
    | caddy | 1m | 66m | 12Mi | 19Mi |
    | sztabina | 1m | 497m | 2Mi | 35Mi |

    Observations

    • Sztabina is now the bottleneck — 497m CPU under load, exceeding the backend (294m). Previous runs showed Sztabina at 1m because no real diff work was being done. This confirms git diff computation is CPU-intensive.

    • DB CPU dropped to 36m — down from 126m in the Layer 4 baseline. Role lookups are now resolved from Spring Security authorities (SZ-79 fix), eliminating per-request DB hits.

    • 106MB total data received during test run — confirms that diff payloads are flowing end-to-end and the 16MB buffer fix is not prematurely truncating responses.

    • p95 authenticated latency: 2.3s. This reflects CPU-bound git diff computation under 20 concurrent requests. Latency is expected to scale with diff size and concurrency; mitigation should focus on limiting concurrent diff execution rather than optimizing JVM paths.

    • 86.5% request failure rate — expected and desired. The majority of unauthenticated bot traffic is intentionally rejected (429 rate limiting at Caddy, 403 at Spring Security). This indicates mitigation layers are actively protecting backend resources.

    Comparison with previous baselines

    | Scenario | Backend CPU | DB CPU | Sztabina CPU | Notes |
    |---|---|---|---|---|
    | Layer 1 only | 370m | 137m | 1m | No real diff work |
    | Layer 2 (Caddy RL) | 174m | 137m | 1m | No real diff work |
    | Layer 4 (auth cache) | 503m | 126m | 1m | No real diff work |
    | SZ-78 (real diffs) | 294m | 36m | 497m | Real git diff load |

    Key finding

    Git diff computation is CPU-bound and shifts the system bottleneck from the JVM to Sztabina. Under concurrent load, Sztabina saturates (~500m CPU) before backend or DB resources become constrained.

    This establishes git diff execution as the dominant cost center in the system and justifies prioritizing rate limiting and concurrency control for diff endpoints.

    🔴 Important: The system is not I/O-bound or DB-bound under load; it is compute-bound on git operations. All further scaling and mitigation decisions should be evaluated against this constraint.

    Next steps

    • Implement git-level rate limiting at Caddy (/git/* endpoints)
    • Test against larger diffs (Linux kernel scale) per SZ-79 task
    • Monitor Sztabina CPU in production — consider horizontal scaling if diff load grows beyond single-pod capacity
  • rk@tigase.net commented 2 weeks ago

    Git endpoint rate limiting policy

    Git operations (clone, fetch, diff) are significantly more expensive than REST requests — each one triggers disk I/O and git object graph traversal in Sztabina. Unlike REST endpoints, git operations are stateless from the client's perspective, meaning a bot with a valid PAT can hammer the same repo repeatedly without any server-side memory of prior requests.

    Two rate limit zones are applied at the Caddy layer:

    • Anonymous git (no Authorization header): 5 requests/min per IP. Public repo access is permitted but tightly throttled. A legitimate user cloning a repo once is unaffected; a crawler hitting the endpoint repeatedly is blocked immediately.

    • Authenticated git (valid PAT): 30 requests/min per IP. Generous enough for CI pipelines and active developer workflows, tight enough to prevent a compromised or bot-controlled PAT from saturating Sztabina under repeated clone/fetch load.

    PAT authentication proves identity but does not limit volume. Rate limiting at the proxy layer is the correct control for volume — it applies regardless of whether the requestor is human or automated, internal or external.

  • rk@tigase.net commented 2 weeks ago

    Git rate limiting directives in Caddy:

    ```
    # ------------------------------
    # Rate limiting — anonymous REST traffic only (Layer 2b)
    # Authenticated requests (JSESSIONID cookie or Authorization header)
    # bypass this — they are governed by Layer 1 (Spring Security) and
    # Layer 4 (permission-based access).
    # 30 events/min per IP gives legitimate anonymous browsers ample
    # headroom while decisively blocking crawlers.
    # {remote_ip} is used as the rate limit key — cheaper than {remote_host}
    # which would trigger a reverse DNS lookup on every request.
    #
    # NOTE: /git/* is explicitly excluded here to ensure git traffic is
    # governed solely by the git-specific rate limit zones below.
    # This makes the zones mutually exclusive and order-independent.
    # ------------------------------
    @anonymous {
        not path /git/*
        not header_regexp Cookie JSESSIONID
        not header Authorization *
    }
    rate_limit @anonymous {
        zone anonymous_zone {
            key    {remote_ip}
            events 30
            window 1m
        }
    }

    # ------------------------------
    # Rate limiting — anonymous git traffic (Layer 2c)
    # Git operations (clone, fetch, diff) are significantly more expensive
    # than REST requests — each triggers disk I/O and git object graph
    # traversal in Sztabina. Anonymous access to public repos is permitted
    # but tightly throttled.
    #
    # 10 r/min accounts for git clone burst behavior — a single clone
    # generates multiple HTTP requests (info/refs, pack negotiation, object
    # fetch). 5 r/min was too tight; 10 r/min blocks sustained crawling
    # while allowing legitimate one-time clones.
    #
    # Rate limit key is {remote_ip}{path} — scoped per IP per repository.
    # Git cost is repo-specific: cloning repo A is independent of cloning
    # repo B. A CI pipeline cloning multiple repos is not penalized the
    # same as a bot hammering a single repo repeatedly.
    #
    # IP-only key for anonymous traffic — no token available.
    # NAT/corporate network tradeoff accepted: anonymous git from a shared
    # IP is already a suspicious pattern.
    # ------------------------------
    @git_anonymous {
        path /git/*
        not header Authorization *
    }
    rate_limit @git_anonymous {
        zone git_anonymous_zone {
            key    {remote_ip}{path}
            events 10
            window 1m
        }
    }

    # ------------------------------
    # Rate limiting — authenticated git traffic (Layer 2d)
    # PAT authentication proves identity but does not limit volume.
    # A compromised or bot-controlled PAT can saturate Sztabina with
    # repeated clone/fetch operations. 30 r/min per IP per repo is
    # generous enough for CI pipelines and active developer workflows
    # while blocking bots.
    #
    # Rate limit key is {remote_ip}{path} — scoped per IP per repository.
    # A developer or CI pipeline working across multiple repos is not
    # penalized; a bot hammering a single repo is throttled.
    #
    # NOTE: Authorization header value (Base64-encoded Basic credentials)
    # is intentionally NOT used as part of the key. The same credentials
    # can produce different encodings across clients (whitespace, padding
    # variations), making the raw header an unreliable key. IP + path
    # is simpler, stable, and matches the actual cost model.
    #
    # Git HTTP protocol uses Authorization headers (Basic/PAT).
    # Session cookies (JSESSIONID) are not used by git clients and
    # are intentionally ignored here to avoid incorrect classification.
    # ------------------------------
    @git_authenticated {
        path /git/*
        header Authorization *
    }
    rate_limit @git_authenticated {
        zone git_authenticated_zone {
            key    {remote_ip}{path}
            events 30
            window 1m
        }
    }
    ```

    Validation

    Rate limiting can be verified manually by issuing repeated requests to the git info/refs endpoint and observing HTTP 429 responses once the configured thresholds are exceeded.

    In addition, the existing k6 stress test will be extended to include git endpoints. Validation criteria:

    • Anonymous git traffic is throttled at ~10 r/min per IP per path
    • Authenticated git traffic is throttled at ~30 r/min per IP per path
    • Legitimate single clone/fetch operations complete without 429
    • Sztabina CPU usage decreases under bot load compared to baseline
  • rk@tigase.net commented 2 weeks ago

    Test git Rate Limiting: Unauthenticated Git

    ```
    rksuma@Ramakrishnans-MacBook-Pro sztab % for i in $(seq 1 15); do
      STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
        "http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/git/TESTSZTAB.git/info/refs?service=git-upload-pack")
      echo "Request $i: $STATUS"
    done
    Request 1: 200
    Request 2: 200
    Request 3: 200
    Request 4: 200
    Request 5: 200
    Request 6: 200
    Request 7: 200
    Request 8: 200
    Request 9: 200
    Request 10: 200
    Request 11: 429
    Request 12: 429
    Request 13: 429
    Request 14: 429
    Request 15: 429
    ```

    Test git Rate Limiting: Authenticated Git

    ```
    rksuma@Ramakrishnans-MacBook-Pro sztab % for i in $(seq 1 35); do
      STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
        -H "Authorization: Basic $(echo -n 'admin:szt_6G9__eAmsumQr2C79F9c5ScAl5NkgwIySshIPE7v' | base64)" \
        "http://ec2-35-87-145-56.us-west-2.compute.amazonaws.com/git/TESTSZTAB.git/info/refs?service=git-upload-pack")
      echo "Request $i: $STATUS"
    done
    Request 1: 200
    Request 2: 200
    Request 3: 200
    Request 4: 200
    Request 5: 200
    Request 6: 200
    Request 7: 200
    Request 8: 200
    Request 9: 200
    Request 10: 200
    Request 11: 200
    Request 12: 200
    Request 13: 200
    Request 14: 200
    Request 15: 200
    Request 16: 200
    Request 17: 200
    Request 18: 200
    Request 19: 200
    Request 20: 200
    Request 21: 200
    Request 22: 200
    Request 23: 200
    Request 24: 200
    Request 25: 200
    Request 26: 200
    Request 27: 200
    Request 28: 200
    Request 29: 200
    Request 30: 200
    Request 31: 429
    Request 32: 429
    Request 33: 429
    Request 34: 429
    Request 35: 429
    ```
    

    Summary:

    • 30 consecutive requests → HTTP 200
    • 31st request onward → HTTP 429

    Confirms:

    • Authenticated git limit enforced at the configured threshold
    • Authorization-based classification working correctly
    • No overlap with the anonymous rate limit zone
  • rk@tigase.net commented 2 weeks ago
  • rk@tigase.net changed state to 'Pending approval' 2 weeks ago
    Previous Value Current Value
    In Progress
    Pending approval
  • rk@tigase.net referenced from other issue 2 weeks ago
  • rk@tigase.net commented 2 weeks ago

    Merged the changes into wolsonsc.

  • rk@tigase.net changed state to 'Closed' 2 weeks ago
    Previous Value Current Value
    Pending approval
    Closed
Type
New Feature
Priority
Normal
Assignee
Version
none
Sprints
n/a
Customer
n/a
Reference
SZ-73