DogPack
Confidential  ·  Executive Briefing  ·  May 2026
NineHertz

H2 2026 — Strategic Overview

Incidents & Action Plan  ·  Scaling H2 2026  ·  AWS & Cloud Security  ·  App Security  ·  Delivery & Team Excellence

Where We Stand — May 2026

An honest, forward-looking update for leadership.

We Were Hit — And We Are Better for It

Two incidents in May 2026 exposed gaps we might never have found otherwise. We moved quickly: three critical fixes are already live, and two more are due by June 8. Our platform is now more secure than it has ever been, and we have a clear plan to keep it that way.

3 Fixes Live  ·  2 Remaining (Jun 8)
Our Growth Demands a Bigger Engine — We Are Designing It

2.5 million users. 200,000 new in just three months. We are analysing each critical area of the platform — feed, search, database, video — reviewing the current design and identifying the right approaches to make each one scalable for the next stage of growth.

Analysis & Design Underway
Cloud Security Is Being Wired Into the Foundation, Not Bolted On

The incidents revealed our cloud security posture needed a serious upgrade. We identified every gap and are closing them systematically. Many are already done — every remaining item has an owner and a deadline.

Active Remediation in Progress
Application Security — Reviewing and Strengthening Every Layer

Beyond the incidents, we are reviewing our application-level security practices — how data is protected, how sessions are managed, how files are handled, and how access is controlled. Some areas are already solid; others are being reinforced right now.

Review In Progress
How We Propose to Build and Deliver Going Forward

We are recommending a unified engineering framework for how both Canada and India teams plan, build, and ship — with clear quality standards, documented decisions, and automation in every sprint. We believe this is the right operating model for DogPack’s scale, and we are proposing it for adoption.

Framework Proposed — Pending Alignment
Critical Infrastructure Incidents & Action Plan

Two critical production incidents occurred in May 2026. Remediation is actively underway. Immediate management attention required on all open items.

Incident 1 — Security Breach — Media Content Downloaded by Attackers

Attackers used an automated script to exploit raw CloudFront URLs hardcoded in the application, downloading a large volume of media content while bypassing CloudFlare security controls.

Root Cause
  • Raw CloudFront distribution URLs were hardcoded in the codebase instead of routing media through the CloudFlare-protected custom domain
  • These URLs bypassed CloudFlare WAF, DDoS protection, and rate limiting entirely
  • No expiry or signing mechanism was in place — captured URLs remained permanently valid
5-Point Remediation Plan
Code Audit & Route via CloudFlare
Fix 01
Done
  • Full codebase scanned — all raw CloudFront URLs identified and replaced
  • All media requests now routed via CloudFlare / CDN custom domain
  • Zero internal media URLs exposed in production
Pre-Signed URLs — Temporary Access
Fix 02
Done
  • All media now served via pre-signed, time-limited URLs
  • Each URL has an expiry timestamp and cryptographic signature
  • Previously captured URLs are now expired and useless
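
The expiry-plus-signature idea behind Fix 02 can be sketched as follows. This is a generic HMAC illustration only: AWS signs CloudFront and S3 URLs with its own schemes, and `SECRET_KEY`, `MEDIA_HOST`, `sign_url`, and `is_valid` are hypothetical names chosen for the example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration; a real key would come from a secrets manager.
SECRET_KEY = b"example-signing-key"
MEDIA_HOST = "https://media.example.com"


def _signature(path: str, expires: int) -> str:
    """HMAC over the path and expiry, so neither can be changed without detection."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a URL that carries an expiry timestamp and a signature."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{MEDIA_HOST}{path}?{query}"


def is_valid(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False  # unsigned or malformed URLs are rejected outright
    current = time.time() if now is None else now
    expected = _signature(parsed.path, expires)
    return hmac.compare_digest(signature, expected) and current < expires
```

A captured link stops working once its expiry passes, and editing the path or the timestamp invalidates the signature.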
WAF & Rate Limiting
Fix 03
Done
  • CloudFlare rate limiting enforced at edge
  • AWS WAF on CloudFront deployed as secondary defense layer
  • Bulk script-based downloads now automatically blocked
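
Edge rate limiting is configured in CloudFlare and AWS WAF rather than written in application code, but the underlying logic can be illustrated with a small sliding-window counter. The class name, limit, and window below are assumptions for the example, not the deployed settings.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: an edge rule would block or challenge here
        hits.append(now)
        return True
```

A bulk download script exhausts its allowance within the window, while a normal client that requests media occasionally is not affected.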
User Agent & Header Validation
Fix 04
Analysis In Progress
  • Analysis underway via CloudFlare to allow only valid requests with recognised user agents and headers
  • Requests from unknown or invalid clients to be rejected before reaching origin
  • ETA June 8, 2026
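
As a rough sketch of the kind of check Fix 04 is evaluating, the function below rejects requests that lack expected headers or present an unrecognised user agent. The allowlisted prefixes and required headers are hypothetical placeholders; the real rules would live in CloudFlare.

```python
# Hypothetical allowlist and required headers, for illustration only.
ALLOWED_AGENT_PREFIXES = ("DogPack-iOS/", "DogPack-Android/", "Mozilla/")
REQUIRED_HEADERS = ("user-agent", "accept")


def is_request_allowed(headers: dict) -> bool:
    """Return True only if required headers are present and the user agent is recognised."""
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    if any(not normalised.get(name) for name in REQUIRED_HEADERS):
        return False
    return normalised["user-agent"].startswith(ALLOWED_AGENT_PREFIXES)
```

Headers are easy to spoof, so a check like this filters out unsophisticated scripts and works best alongside signed URLs and rate limiting rather than in place of them.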
CloudFront Rename
Fix 05
Analysis In Progress
  • Renaming CloudFront distribution so any previously captured URLs return 404
  • Discussion needed — SEO and WordPress dependency impact to be assessed before proceeding
  • ETA June 8, 2026
Incident 2 — Database Outage: MySQL Read Replica Unresponsive

One read replica became completely unresponsive, which blocked all read replicas and caused data loads to fail for users. A manual restart was required. No single root cause has been confirmed; multiple workloads were running in parallel at the time.

Contributing Factors (All Simultaneous)
📈 Application Traffic — Normal user-facing read load on the replica
🤖 Google Bot Crawling — Aggressive indexing amplifying read queries significantly
🔄 Shopify Sync — Background sync jobs competing for CPU on the same instance
Result: All three workloads running simultaneously — one read replica became unresponsive and all read replicas were blocked, requiring manual restart
Action Plan & Current Status
Action Item | Status | Notes
Database Instance Analysis | Done | Analysis complete: instance sizing and capacity reviewed. Recommendation: upgrade to a higher instance type to handle combined peak workloads.
MySQL Instance Upgrade | Pending | Upgrade read replica to handle combined peak workloads. Prevents recurrence under concurrent load scenarios.
Google Bot (robots.txt Config) | In Progress | Configuring crawl rate limits and restricting unnecessary page indexing. Reduces DB load from aggressive bot traffic.
Feed Architecture Review | Done | Complex feed logic must be offloaded from MySQL. Move ranking and discovery to Elastic / OpenSearch.
Offload Feed to Search DB (OpenSearch) | Feasibility Analysis | Evaluating feasibility of migrating feed ranking and query logic to OpenSearch. Would eliminate expensive wildcard queries on the MySQL read replica.
Last 30-Day Data in Cache | Feasibility Analysis | Evaluating feasibility of keeping last 30 days of feed data in a fast cache layer. Goal: reduce repetitive read load on MySQL for recent content.
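
For the robots.txt item, a draft rule set can be checked before it goes live using Python's standard `urllib.robotparser`. The paths and the `www.example.com` host below are placeholders, not DogPack's real routes. `Crawl-delay` is left out because, to our knowledge, Googlebot does not honour it, so restricting which paths are crawled is the lever shown here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the paths are placeholders for query-heavy pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /feed/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` fetch `path`."""
    return parser.can_fetch(agent, f"https://www.example.com{path}")
```

Calling `crawlable("/search/users")` returns False under these rules, while a path that matches no `Disallow` line remains crawlable.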
✓ Observability — App-Side Logs Enabled (May 21)
Scaling H2 2026 — Architecture Review

With rapid user growth, a full architecture review is underway across all critical features — not limited to the database. The goal is to ensure every system layer can support concurrent load, eliminate single points of failure, and sustain growth beyond current scale.

Architecture Review of Critical Features
In Progress
  • A comprehensive review is being conducted across feed loading, search, database, real-time calculations, and observability to identify bottlenecks and improvement scope at scale.
Improvement Scope
1. Feed Load Optimisation
Analysis In Progress
  • Current: Feed calculated on-demand per request — slow under concurrent load
  • Improvement: Analysis in progress to use a dedicated search database and caching layer for faster feed loading and processing
  • Goal: significantly reduce MySQL load and improve response time at scale
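
The caching idea can be illustrated with a minimal cache-aside helper. An in-memory dictionary stands in for a shared cache such as Redis, and the class name, key format, and TTL are assumptions made for the sketch.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory cache-aside helper; a stand-in for a shared cache."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling `loader` only on a miss or after expiry."""
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # e.g. the expensive feed query against the database
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

Repeated requests for the same feed within the TTL are served from the cache, so the database sees one query per key per TTL window instead of one per request.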
2. Video Streaming Quality
Solution Planned
  • Current: Single resolution (720p only), heavy compression, 30 sec–2 min latency before video is viewable
  • Solution: Adaptive video streaming with multi-resolution output (480p, 720p, 1080p) and adaptive bitrate player
  • 3-phase roadmap prepared — pending approval to proceed
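
An adaptive bitrate player chooses a rendition based on measured bandwidth. The selection step might look like the sketch below, where the bitrate ladder and the headroom factor are illustrative assumptions rather than planned encoder settings.

```python
# Hypothetical bitrate ladder in kbps; real values depend on the encoder settings chosen.
RENDITIONS = [("480p", 1200), ("720p", 2800), ("1080p", 5000)]


def pick_rendition(bandwidth_kbps: float, headroom: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits within a share of measured bandwidth."""
    budget = bandwidth_kbps * headroom
    chosen = RENDITIONS[0][0]  # always fall back to the lowest rendition
    for name, bitrate in RENDITIONS:
        if bitrate <= budget:
            chosen = name
    return chosen
```

Leaving headroom below the measured bandwidth reduces the chance of stalls when the connection fluctuates.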
3. Search Optimisation
Feasibility Analysis
  • Current: User search and dog search running expensive wildcard queries directly on MySQL
  • Improvement: Move user search and dog search to a dedicated search database (OpenSearch)
  • Offloads heavy search load from MySQL — faster results and reduced DB pressure
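
A leading-wildcard match such as `LIKE '%max%'` cannot use a regular B-tree index, so the database scans every row. A search engine answers the same question from an inverted index. The toy index below shows the principle only; OpenSearch adds analysis, ranking, and distribution on top, and `NameIndex` is a hypothetical name.

```python
from collections import defaultdict


class NameIndex:
    """Toy inverted index: maps each lowercase token to the IDs that contain it."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def add(self, doc_id: int, name: str) -> None:
        for token in name.lower().split():
            self._postings[token].add(doc_id)

    def search(self, query: str) -> set:
        """Return IDs containing every token in the query, without scanning all records."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result
```

Lookup cost depends on the number of matching entries rather than on the total number of users or dogs stored.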
4. Database Optimisation
Analysis In Progress
  • Translations logic reviewed — scalable logic finalised, POC pending
  • Database instance upgrade — analysis in progress, discussion needed on instance sizing
  • Reviewing overall DB architecture to support higher concurrent connections
5. Follow / Unfollow Performance
Analysis Pending
  • Performance issue identified with follow and unfollow operations
  • Both database and code-level analysis pending to identify root cause
  • Evaluating whether a graph database approach is suitable for follower relationships at scale
6. Load Testing & Concurrency Validation
Planned
  • Validate application performance across critical flows under 500, 800, and 1,000 concurrent users
  • Run tests against current fixes already deployed to confirm stability, then re-test as each optimisation lands
  • Feed, search, and DB improvements will significantly offload MySQL — load tests will confirm the compounded improvement in response time and concurrency capacity
  • Identify any remaining bottlenecks before they become production incidents
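
The concurrency checks could be prototyped with a small harness like the one below before moving to a dedicated load-testing tool. `fake_request` is a stand-in that only sleeps; in a real run it would be replaced with a client call to the flow under test.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> None:
    """Stand-in for one call to a critical flow; replace with a real client call."""
    time.sleep(0.001)


def run_load(concurrent_users: int, request=fake_request) -> dict:
    """Fire one request per simulated user in parallel and summarise latency."""

    def timed_call(_):
        start = time.perf_counter()
        request()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, range(concurrent_users)))
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)  # nearest-rank percentile
    return {
        "users": concurrent_users,
        "p95_seconds": latencies[p95_index],
        "max_seconds": latencies[-1],
    }
```

Running `run_load` at 500, 800, and 1,000 users before and after each optimisation gives a like-for-like comparison of tail latency.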
Tentative Delivery Timeline

All dates are indicative and subject to change. This chart is directional only — not a committed roadmap.

Chart legend: Analysis / POC  ·  Development  ·  Testing & Delivery  ·  Recurring
Months shown: JUN, JUL, AUG, SEP
Phases per workstream, in chart order:
  • 1. Feed Load Optimisation: POC, Development, Testing
  • 2. Video Streaming Quality: Ph 1 Dev, Testing, Ph 2 Dev
  • 3. Search Optimisation: Development, Testing / Delivery
  • 4. Database Optimisation: Ph 1 (Upgrade), Validate & Delivery
  • 5. Follow / Unfollow Performance: Analysis, Development, Testing

⚠ All timelines are tentative. Actual delivery depends on feasibility outcomes, team capacity, and business priorities. Confirmed sprint by sprint.

How These Changes Work Together
Combined DB offload impact: Feed load optimisation (OpenSearch + Redis), search migration (OpenSearch), and caching improvements will collectively and significantly reduce read load on MySQL — making the database faster and more resilient for all remaining operations. Each improvement compounds the benefit of the others.
Target State
Application Performance
  • Completing the items above is expected to substantially increase concurrent user capacity and overall app performance
  • Feed, search, and follow operations are expected to stay fast and predictable under peak load
  • Load testing will validate and quantify the improvement at each phase
Infrastructure Resilience
  • No single point of failure — load spread across purpose-built systems (MySQL, OpenSearch, Redis)
  • MySQL relieved of read-heavy workloads — reserved for writes and critical operations only
  • Architecture built to sustain growth well beyond current user base
AWS and Cloud Security

Security controls covering IAM, network hardening, VM patching, threat detection, and post-incident remediation for all AWS workloads.

IAM and MFA Enforcement
In Progress
  • MFA enforced for all users on AWS Console — Done
  • Root account protected; never used for daily operations
  • Regular IAM access key rotation — In Progress
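
Access key rotation is easier to sustain when stale keys are flagged automatically. The sketch below assumes a 90-day policy, which the briefing does not state, and takes plain (key ID, creation date) pairs rather than calling any AWS API.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # assumed rotation policy, not stated in the briefing


def keys_due_for_rotation(keys, now=None):
    """Return the IDs of access keys older than MAX_KEY_AGE.

    `keys` is an iterable of (key_id, created_at) pairs with timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    return [key_id for key_id, created_at in keys if now - created_at > MAX_KEY_AGE]
```

The input pairs could be produced from an IAM credential report or an SDK listing and the result posted to the team's tracker.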
VM Patching and OS Hardening
In Progress
  • OS patching managed via AWS Systems Manager Patch Manager
  • Ubuntu instance patching — Pending
  • SSH root login — already disabled
  • All RDP and SSH from internet — already disabled
Threat Detection and Audit Logging
Updated Post-Incident
  • Amazon GuardDuty — continuous ML-based threat detection, Done
  • AWS CloudTrail — full API and access audit logging, Done
  • Amazon Inspector — cost analysis in progress (EC2 scan pricing under review)
  • CloudWatch anomaly alerts and WAF block log review — monthly cadence, In Progress
WAF, CloudFront and DDoS Protection
Updated Post-Incident
  • AWS WAF attached to CloudFront with rate limiting and managed rule sets — In Progress
  • Lambda-based automated monitoring active — detects and blocks suspicious IPs, Done
  • Origin Access Control (OAC) configured — Done
  • WAF and security rules cleanup — In Progress
  • CloudFlare proxy — In Progress
Network and Access Hardening
In Progress
  • EC2 Security Groups — removing all open internet-facing ports, In Progress
  • Block all public S3 access — In Progress
  • Database instances in private subnets, no public endpoint — In Progress
  • Public RDS access restricted — In Progress
  • ALB access restrictions — In Progress
  • Public instance migration to private — Pending
  • Development domains restricted to office IP / VPN only — In Progress
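
Open internet-facing ports can be found by scanning security group rules for the any-address CIDR ranges. The sketch assumes each group is a dictionary shaped like the EC2 API's security group description (`IpPermissions`, `IpRanges`, `CidrIp`); the sample data is invented.

```python
def open_to_internet(security_group: dict) -> list:
    """List (protocol, from_port, to_port) rules that allow traffic from any address."""
    findings = []
    for rule in security_group.get("IpPermissions", []):
        cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        if "0.0.0.0/0" in cidrs or "::/0" in cidrs:
            findings.append((rule.get("IpProtocol"), rule.get("FromPort"), rule.get("ToPort")))
    return findings


# Invented example: SSH open to the world, MySQL limited to a private range.
EXAMPLE_GROUP = {
    "GroupId": "sg-example",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
}
```

Here `open_to_internet(EXAMPLE_GROUP)` flags only the SSH rule, which is the kind of finding this hardening item is meant to remove.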
Bot and Traffic Controls
Already Present
  • Google Crawler Bot rule — already present
  • cURL block at domain level — already present
  • cURL block at application endpoint — In Progress
AWS Tooling Under Review
  • AWS Trusted Advisor — analysing the tool, may require a higher AWS support plan
  • Automated misconfiguration scanning — Pending
  • Publicly exposed Kafka — In Progress
Security Task Status
Task | Status
Pre-Signed URL enforcement on media | Pending
Block all public S3 access | In Progress
Attach WAF to CloudFront with rate limiting | In Progress
Publicly Exposed Kafka | In Progress
Database instances in private subnets, no public endpoint | In Progress
Trusted Advisor and Config misconfig scanning | Pending
Monthly CloudWatch anomaly and WAF block log review | In Progress
WAF and Security Rules Cleanup | In Progress
cURL Block on Application Endpoint | In Progress
CloudFlare Proxy | In Progress
Restrict dev domains to office IP / VPN | In Progress
Public Instance migration to Private | Pending
ALB Access Restrictions | In Progress
Application Security Best Practices

Industry-standard guidelines for all AWS-hosted applications across 10 core domains, based on OWASP Top 10, AWS Security Reference Architecture, NIST SP 800-53, and CIS Benchmarks.

OSS Governance and Approval
Package Analysis In Progress
  • All packages under review — Legal and Compliance approval process being established
  • Tooling: FOSSA for license compliance scanning across all dependencies
  • Restrict AGPL and commercial dual-license components from production use
Session Management
Already Implemented
  • Cryptographically random tokens — 128-bit entropy minimum
  • Server-side invalidation on logout — clearing cookie alone is insufficient
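
Both session rules can be shown in a few lines. The in-memory `SessionStore` below is a hypothetical stand-in for whatever session backend the application uses.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a server-side session store."""

    def __init__(self) -> None:
        self._sessions = {}  # token -> user id

    def create(self, user_id: str) -> str:
        token = secrets.token_urlsafe(16)  # 16 random bytes = 128 bits of entropy
        self._sessions[token] = user_id
        return token

    def user_for(self, token: str):
        """Return the user for a live session, or None if it is unknown or revoked."""
        return self._sessions.get(token)

    def logout(self, token: str) -> None:
        # Remove the session on the server so a copied cookie stops working.
        self._sessions.pop(token, None)
```

Because the token is removed server-side, a cookie value captured before logout no longer maps to a user afterwards.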
Data Protection and Encryption
  • AES-256 encryption at rest for all sensitive data
  • TLS 1.2 minimum for all data in transit
  • Credentials never hardcoded — rotate secrets on a defined schedule
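
Two of these rules are simple to express in code: a TLS floor and credentials read from the environment. The sketch uses Python's standard `ssl` module on the client side, and the helper names are illustrative. Encryption at rest is configured in the storage services and is not shown.

```python
import os
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def load_secret(name: str) -> str:
    """Read a credential from the environment instead of hardcoding it."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

Failing fast on a missing secret surfaces configuration mistakes at start-up rather than at the first database call.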
Secure File Uploads
  • File scanning — Pending
  • MIME type and extension allowlisting — Already Implemented
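
An allowlist check of this kind might look like the sketch below. The accepted types are placeholders, since the real list depends on which media the app supports.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist; the real one depends on which media types the app accepts.
ALLOWED_UPLOADS = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".mp4": "video/mp4",
}


def is_upload_allowed(filename: str, declared_mime: str) -> bool:
    """Accept a file only if its extension is allowlisted and matches the declared MIME type."""
    extension = PurePosixPath(filename).suffix.lower()
    return ALLOWED_UPLOADS.get(extension) == declared_mime.lower()
```

The declared MIME type comes from the client, so this check complements content scanning and does not replace it.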
XSS and Input Validation
Implemented — Detailed Review In Progress
  • Validation implemented at entry points — currently being validated in detail to ensure full coverage
  • Output encoding at render time — both layers required
  • CSP, HSTS, X-Frame-Options, CORS on every HTTP response
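
Output encoding and response headers can be sketched with the standard library alone. The header values are examples only; a suitable policy depends on the app's scripts, frames, and origins, and CORS is left out because the allowed origins are application-specific.

```python
import html

# Example values only; tune the policy to the application's actual needs.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
}


def with_security_headers(headers: dict) -> dict:
    """Return a copy of the response headers with the security headers applied."""
    return {**headers, **SECURITY_HEADERS}


def render_comment(user_text: str) -> str:
    """Encode user input at render time so markup in it is displayed, not executed."""
    return f"<p>{html.escape(user_text)}</p>"
```

Input validation rejects bad data early, while encoding at render time neutralises anything that still gets through.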
Data Access Control
Review In Progress
  • Access control checks are present in the codebase
  • Reviewing whether controls are correctly enforced at the backend layer or only at frontend — backend enforcement is mandatory
  • Ensuring no data can be accessed by bypassing the UI directly via API
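
Backend enforcement means the ownership check runs inside the API handler itself. The records, IDs, and function name below are invented for illustration.

```python
class Forbidden(Exception):
    """Raised when the caller may not access the requested record."""


# Hypothetical records keyed by ID; a real service would query its database.
DOG_PROFILES = {
    101: {"owner_id": "user-1", "name": "Max"},
    102: {"owner_id": "user-2", "name": "Bella"},
}


def get_dog_profile(current_user_id: str, profile_id: int) -> dict:
    """API handler sketch: ownership is checked on the server for every request."""
    profile = DOG_PROFILES.get(profile_id)
    if profile is None or profile["owner_id"] != current_user_id:
        raise Forbidden(f"user {current_user_id} may not read profile {profile_id}")
    return profile
```

A caller who skips the UI and requests another user's profile ID directly still hits the same check and is refused.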
Delivery & Team Excellence

A proposed framework for how Canada and India teams plan, build, and ship together — one process, one standard, one product. Four pillars: Visibility, Quality, Automation, and Scalability.

One Team Structure
  • Canada — Product website, client features, customer coordination, business requirements
  • India — App and backend, infrastructure, DevOps, platform engineering, performance
  • Daily Scrum (15–30 min) across both teams — blockers, risks, integration topics
  • Weekly Architecture and Design Sync for deep technical alignment
Jira & Confluence — Single Source of Truth
  • No Jira ticket = No work — every change, no matter how small
  • Full hierarchy: Epics → Stories → Sub-tasks → Bugs and Risks
  • Both teams, one board, full transparency
  • Severity-classified bugs triaged in every daily scrum
  • All documentation, decisions, and runbooks maintained in Confluence
Story Quality Standard
  • Clear Scope — every dev, QA, BA must articulate it in one sentence
  • Technical Approach — API contracts and integration points made explicit
  • Impacted Areas — all affected modules listed upfront (most critical element)
  • Test Cases written before development starts (shift-left)
  • BA and QA sign-off required before any story is closed
In-Sprint Automation
  • Mandate: Build it, automate it, in the same sprint — no deferrals
  • Flow: Groom → Dev → QA → Automation Written → Evidence Captured → Story Closed
  • Testing evidence required on Dev and Staging environments
Architecture Decision Records (ADRs)
  • Every significant architectural decision must be documented as an ADR in Confluence
  • Captures context, options considered, decision made, and trade-offs — permanent reference for the team
  • Prevents repeated debate on settled decisions and ensures new team members have full context
  • ADRs linked directly to the relevant Jira epic or story
Four Pillars:   Visibility — every work item in Jira  ·  Quality — shift-left testing, sign-off required  ·  Automation — in-sprint, no deferrals  ·  Scalability — scale considered in every design decision from day one
NineHertz Support Coverage & Engagement

Introducing Rohit Rao (CTO, NineHertz) and Arpit (Cloud Security Consultant) to monitor progress, align priorities, and accelerate delivery across security and infrastructure.

NineHertz CTO
Rohit Rao
CTO Introduction

Rohit Rao to be introduced to align on support expectations, ongoing priorities, and delivery direction across all active workstreams.

  • Aligns delivery direction
  • Establishes ongoing touchpoint
  • Single escalation point for CRO & CEO
Cloud Security Consultant
Arpit
Security Coverage

Arpit included in all security discussions to ensure coverage is aligned with planned allocation and remediation is actively monitored.

  • Planned allocation honoured
  • Security ownership clarified
  • Incident remediation tracked end-to-end
Day-to-Day
Working Coordination

NineHertz team works closely with DogPack on day-to-day coordination, ensuring visibility into open items, blockers, and progress across all workstreams.

  • Direct line for execution
  • Faster issue resolution
  • Progress visible to CRO & CEO at all times