# URL Shortener

When discussing the system design of a URL shortener like TinyURL or Bit.ly, ensure you cover the following key aspects. Structuring your notes around these will help you stay organized during the interview.

***

## Requirements

### **Functional Requirements**

* Shorten a given URL and return a unique, shortened version
* Redirect users to the original URL when they use the shortened URL.
* Custom alias support
* Optionally, allow expiration and analytics (clicks, user analytics, etc.).

### **Non-Functional Requirements**

* **Focus on availability over consistency**: The service should always be up.
* **Low Latency**: Redirection should be fast.
* **Scalability**: Support millions or billions of URLs.
* **Durability**: Store URLs reliably to prevent data loss.

***

## **Estimation and Capacity Planning**

* **QPS (Queries Per Second)**: Estimate reads and writes (e.g., 10:1 ratio).
* **Storage Requirements**: Assume average URL length and calculate storage needs (e.g., 1 billion URLs).

***

## Entities

* Original URL
* Shortened URL
* User

***

## **API Design**

* **POST `/shorten`** -> returns a shortened URL.
  * { originalUrl, customAlias?, expirationTime? }
* **GET `{shortUrl}`**->  Redirects to the original URL.
  * **HTTP 302 (temporary redirect)** instead of HTTP 301 (permanent redirect which browser caches)
* Optional APIs for stats or management:
  * **GET `/stats/{shortUrl}`**.

**b) Database Schema**

* `ShortUrl (short_key, original_url, created_at, expiration_date)`
* Index on `short_key` for fast lookups.

**c) URL Shortening Logic**

* **Short Key Generation**:
  * Use a **Base62** encoding scheme (characters `a-z`, `A-Z`, `0-9`) to keep keys short and human-readable.
    * A 6-character Base62 key can represent over 56 billion unique URLs (62⁶).
  * **Counter/Sequence**: Increment a global counter and encode the value in Base62.
    * A single Redis instance on modern hardware can handle **80,000 to 100,000 operations per second (OPS)** for simple commands like `INCR`, assuming:
      * Redis is running on dedicated high-performance hardware.
      * The network is low-latency and high-bandwidth.
      * There are no significant memory or disk I/O bottlenecks.
* Handle **collisions** in case of duplicate keys.

**d) Data Storage**

* Choose a storage system based on scale:
  * **Relational Databases**: MySQL/PostgreSQL for small-scale systems.
  * **NoSQL Databases**: DynamoDB, Cassandra, or MongoDB for large-scale systems.
  * In-memory caching (Redis or Memcached) for frequently accessed data.

**e) Redirection**

* Use **HTTP 301 (Moved Permanently)** or **302 (Found)** for redirection.

***

## **Scaling the System**

**a) Read/Write Patterns**

* Reads will dominate writes (many more redirections than shortenings).
* Optimize for high-read throughput using caching.

**b) Partitioning and Replication**

* Use database replication for durability, fault tolerance and to allow scaling reads horizontally.
* Partition data by the short key to distribute load (consistent hashing).

**c) Caching**

* Use Redis or Memcached to cache mappings of frequently accessed `short_key -> original_url`.

**d) Rate Limiting**

* Implement rate limiting to prevent abuse (e.g., spamming the shorten API).

***

## **High-Level Architecture**

<figure><img src="/files/b1tpY7uEIx7dfdRHz1Ou" alt=""><figcaption></figcaption></figure>

***

## **Trade-offs and Challenges**

* **Key Length**: Shorter keys are user-friendly but increase the chance of collisions.
* **Consistency**: Balancing consistency and availability (CAP theorem).
* **Redundancy**: Ensure data is not lost due to server failures.
* **Performance**: Optimize key lookup and redirection latency.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://notes.mikaelsamvelian.com/system-design/url-shortener.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
