👨‍💻
Mike's Notes
  • Introduction
  • MacOs Setup
    • System Preferences
    • Homebrew
      • Usage
    • iTerm
      • VIM
      • Tree
      • ZSH
    • Visual Studio Code
    • Git
    • SSH Keys
  • DevOps Knowledge
    • SRE
      • Scaling Reliably
        • Splitting a Monolith into Microservices
      • Troubleshooting Common Issues
      • Service Level Terminology
      • Toil
      • Monitoring
      • Release Engineering
      • Best Practices
      • On-Call
      • Alerting
    • Containers
      • Docker
        • Best Practices
          • Image Building
          • Docker Development
        • CLI Cheat Sheet
      • Container Orchestration
        • Kubernetes
          • Benefits
          • Cheat Sheet
          • Components
          • Pods
          • Workload Resources
          • Best Practices
    • Developer Portal 👨‍💻
      • Solution Overview 🎯
      • System Architecture 🏗️
      • Implementation Journey 🛠️
      • Cross-team Collaboration 🤝
      • Lessons & Future 🎓
    • Provisioning
      • Terraform
        • Installation
        • Usage
    • Configuration Management
      • Ansible
        • Benefits
        • Installation
    • Build Systems
      • Bazel
        • Features
  • Security
    • Secure Software Engineering
    • Core Concepts
    • Security Design Principles
    • Software Security Requirements
    • Compliance Standards and Policies
      • Sarbanes-Oxley (SOX)
      • HIPAA and HITECH
      • Payment Card Industry Data Security Standard (PCI-DSS)
      • General Data Protection Regulation (GDPR)
      • California Consumer Privacy Act (CCPA)
      • Federal Risk and Authorization Management Program (FedRAMP)
    • Privacy & Data
  • Linux Fundamentals
    • Introduction to Linux
    • Architecture
    • Server Administration
      • User / Groups
      • File Permissions
      • SSH
      • Process Management
    • Networking
      • Diagrams
      • Browser URL Example
      • Network Topologies
      • Signal Routing
      • DNS (Domain Name System)
      • SSL (Secure Sockets Layer)
      • TLS (Transport Layer Security)
  • System Design
    • Process
    • Kafka
      • Advanced Topics
    • URL Shortener
Powered by GitBook
On this page
  • Requirements
  • Functional Requirements
  • Non-Functional Requirements
  • Estimation and Capacity Planning
  • Entities
  • API Design
  • Scaling the System
  • High-Level Architecture
  • Trade-offs and Challenges

Was this helpful?

  1. System Design

URL Shortener

When discussing the system design of a URL shortener like TinyURL or Bit.ly, ensure you cover the following key aspects. Structuring your notes around these will help you stay organized during the interview.


Requirements

Functional Requirements

  • Shorten a given URL and return a unique, shortened version

  • Redirect users to the original URL when they use the shortened URL.

  • Custom alias support

  • Optionally, allow expiration and analytics (clicks, user analytics, etc.).

Non-Functional Requirements

  • Focus on availability over consistency: The service should always be up.

  • Low Latency: Redirection should be fast.

  • Scalability: Support millions or billions of URLs.

  • Durability: Store URLs reliably to prevent data loss.


Estimation and Capacity Planning

  • QPS (Queries Per Second): Estimate reads and writes (e.g., 10:1 ratio).

  • Storage Requirements: Assume average URL length and calculate storage needs (e.g., 1 billion URLs).


Entities

  • Original URL

  • Shortened URL

  • User


API Design

  • POST /shorten -> returns a shortened URL.

    • { originalUrl, customAlias?, expirationTime? }

  • GET {shortUrl}-> Redirects to the original URL.

    • HTTP 302 (temporary redirect) instead of HTTP 301 (permanent redirect which browser caches)

  • Optional APIs for stats or management:

    • GET /stats/{shortUrl}.

b) Database Schema

  • ShortUrl (short_key, original_url, created_at, expiration_date)

  • Index on short_key for fast lookups.

c) URL Shortening Logic

  • Short Key Generation:

    • Use a Base62 encoding scheme (characters a-z, A-Z, 0-9) to keep keys short and human-readable.

      • A 6-character Base62 key can represent over 56 billion unique URLs (62⁶).

    • Counter/Sequence: Increment a global counter and encode the value in Base62.

      • A single Redis instance on modern hardware can handle 80,000 to 100,000 operations per second (OPS) for simple commands like INCR, assuming:

        • Redis is running on dedicated high-performance hardware.

        • The network is low-latency and high-bandwidth.

        • There are no significant memory or disk I/O bottlenecks.

  • Handle collisions in case of duplicate keys.

d) Data Storage

  • Choose a storage system based on scale:

    • Relational Databases: MySQL/PostgreSQL for small-scale systems.

    • NoSQL Databases: DynamoDB, Cassandra, or MongoDB for large-scale systems.

    • In-memory caching (Redis or Memcached) for frequently accessed data.

e) Redirection

  • Use HTTP 301 (Moved Permanently) or 302 (Found) for redirection.


Scaling the System

a) Read/Write Patterns

  • Reads will dominate writes (many more redirections than shortenings).

  • Optimize for high-read throughput using caching.

b) Partitioning and Replication

  • Use database replication for durability, fault tolerance and to allow scaling reads horizontally.

  • Partition data by the short key to distribute load (consistent hashing).

c) Caching

  • Use Redis or Memcached to cache mappings of frequently accessed short_key -> original_url.

d) Rate Limiting

  • Implement rate limiting to prevent abuse (e.g., spamming the shorten API).


High-Level Architecture


Trade-offs and Challenges

  • Key Length: Shorter keys are user-friendly but increase the chance of collisions.

  • Consistency: Balancing consistency and availability (CAP theorem).

  • Redundancy: Ensure data is not lost due to server failures.

  • Performance: Optimize key lookup and redirection latency.

PreviousAdvanced Topics

Last updated 5 months ago

Was this helpful?