Sven Ruppert<p><strong>Short links, clear architecture – A URL shortener in Core Java</strong></p><p>A URL shortener seems harmless – but if implemented incorrectly, it opens the door to phishing, enumeration, and data leakage. In this first part, I’ll explore the theoretical and security-relevant fundamentals of a URL shortener in Java – without any frameworks, but with a focus on entropy, collision tolerance, rate limiting, validity logic, and digital responsibility. The second part covers the complete implementation: modular, transparent, and as secure as possible.</p> <ol><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#1-1-motivation-and-use-cases" rel="nofollow noopener" target="_blank">1.1 Motivation and use cases</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#1-2-differentiation-from-related-technologies" rel="nofollow noopener" target="_blank">1.2 Differentiation from related technologies</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#1-3-objective-of-the-paper" rel="nofollow noopener" target="_blank">1.3 Objective of the paper</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#2-1-uri-url-and-urn-conceptual-basics" rel="nofollow noopener" target="_blank">2.1 URI, URL and URN – conceptual basics</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#2-2-principles-of-address-shortening" rel="nofollow noopener" target="_blank">2.2 Principles of address shortening</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#2-3-entropy-collisions-and-permutation-spaces" rel="nofollow noopener" target="_blank">2.3 Entropy, 
collisions and permutation spaces</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#4-2-url-encoding-hashing-base62-and-alternatives" rel="nofollow noopener" target="_blank">4.2 URL Encoding: Hashing, Base62 and Alternatives</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#4-3-mapping-store-interface-implementation-synchronisation" rel="nofollow noopener" target="_blank">4.3 Mapping Store: Interface, Implementation, Synchronisation</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#4-4-rest-api-with-pure-java-http-server-handler-routing" rel="nofollow noopener" target="_blank">4.4 REST API with pure Java (HTTP server, handler, routing)</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#4-5-error-handling-logging-and-monitoring" rel="nofollow noopener" target="_blank">4.5 Error handling, logging and monitoring</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#5-1-abuse-opportunities-and-protection-mechanisms" rel="nofollow noopener" target="_blank">5.1 Abuse opportunities and protection mechanisms</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#5-2-rate-limiting-and-ip-based-throttling" rel="nofollow noopener" target="_blank">5.2 Rate limiting and IP-based throttling</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#5-3-validity-period-and-deletion-concepts" rel="nofollow noopener" target="_blank">5.3 Validity period and deletion concepts</a></li><li><a class="" 
href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#5-4-protection-against-enumeration-and-information-leakage" rel="nofollow noopener" target="_blank">5.4 Protection against enumeration and information leakage</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#6-1-access-times-and-hash-lookups" rel="nofollow noopener" target="_blank">6.1 Access times and hash lookups</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#6-2-memory-usage-and-garbage-collection" rel="nofollow noopener" target="_blank">6.2 Memory usage and garbage collection</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#6-3-benchmarking-local-tests-and-load-simulation" rel="nofollow noopener" target="_blank">6.3 Benchmarking: Local tests and load simulation</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#7-1-custom-aliases" rel="nofollow noopener" target="_blank">7.1 Custom aliases</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#7-2-access-counting-and-analytics" rel="nofollow noopener" target="_blank">7.2 Access counting and analytics</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#7-3-qr-code-integration" rel="nofollow noopener" target="_blank">7.3 QR-Code-Integration</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#7-4-integration-into-messaging-or-tracking-systems" rel="nofollow noopener" target="_blank">7.4 Integration into messaging or tracking systems</a></li><li><a class="" 
href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#8-1-data-protection-for-link-tracking" rel="nofollow noopener" target="_blank">8.1 Data protection for link tracking</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#8-2-responsibility-for-forwarding" rel="nofollow noopener" target="_blank">8.2 Responsibility for forwarding</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#8-3-transparency-and-disclosure-of-the-destination-address" rel="nofollow noopener" target="_blank">8.3 Transparency and disclosure of the destination address</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#9-1-lessons-learned" rel="nofollow noopener" target="_blank">9.1 Lessons Learned</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#9-2-possible-further-developments-e-g-blockchain-dnssec" rel="nofollow noopener" target="_blank">9.2 Possible further developments (e.g. blockchain, DNSSEC)</a></li><li><a class="" href="https://svenruppert.com/2025/06/10/short-links-clear-architecture-a-url-shortener-in-core-java/#9-3-importance-of-url-shorteners-in-the-context-of-digital-sovereignty" rel="nofollow noopener" target="_blank">9.3 Importance of URL shorteners in the context of digital sovereignty</a></li></ol> <p><strong><strong>1.1 Motivation and use cases</strong></strong></p><p>In an increasingly fragmented and mobile information world, URLs are not just technical addressing mechanisms; they are central building blocks of digital communication. Long and hard-to-remember URLs are a hindrance in social media, emails, or QR codes, as they are not only aesthetically unappealing but also prone to errors when manually entered. 
URL shorteners address this problem by generating compact representations that point to the original target address. In addition to improved readability, aspects such as statistical analysis, access control, and campaign tracking also play a key role.</p><p>Initially popularised by services like TinyURL or bit.ly, URL shorteners have now become integrated into many technical infrastructures – from marketing platforms and messaging systems to IoT applications, where storage and bandwidth restrictions play a significant role. A shortened representation of URLs is also a clear advantage in the context of QR codes or limited character sets (e.g., in SMS or NFC data sets).</p><p><strong><strong>1.2 Differentiation from related technologies</strong></strong></p><p>A URL shortener is not a classic forwarding platform and is conceptually different from proxy systems, link resolvers, or load balancers. While the latter often operate at the transport or application layer (Layer 4 or Layer 7 in the OSI model) and optimise transparency, availability, or performance, a shortener primarily pursues the goal of simplifying the display and management of URLs. Nevertheless, there are overlaps, particularly in the analysis of access patterns and the configuration of redirect policies.</p><p>In this work, a minimalist URL shortener is designed and implemented. It deliberately avoids external frameworks to implement the central concepts in a comprehensible and transparent manner in Core Java. The choice of Java 24 enables the integration of modern language features, such as records, sealed types, and virtual threads, into a secure and robust architecture.</p><p><strong><strong>1.3 Objective of the paper</strong></strong></p><p>This paper serves a dual purpose: on the one hand, it aims to provide a deep technical understanding of the functionality and challenges associated with a URL shortener. 
On the other hand, it serves as a practical guide for implementing such a service using pure Java – that is, without Spring, Jakarta EE, or external libraries.</p><p>To this end, a comprehensive architecture will be developed, implemented, and continually enhanced with key aspects such as security, performance, and extensibility. The focus is deliberately on a system-level analysis of the processes to provide developers with a deeper understanding of the interaction between the network layer, coding strategies, and persistent storage. The goal is to develop a viable model that can be used both in educational contexts and as a basis for production services.</p><p><strong><strong>2. Technical background</strong></strong></p><p><strong><strong>2.1 URI, URL and URN – conceptual basics</strong></strong></p><p>In everyday language, terms such as “URL” and “link” are often used synonymously, although in a technical sense, they describe different concepts. A <strong>URI (Uniform Resource Identifier)</strong> is any character string that uniquely names or locates a resource. A <strong>URL (Uniform Resource Locator)</strong> is a special form of URI that not only identifies a resource but also describes how to access it, for example, through a scheme such as https, ftp, or mailto. A <strong>URN (Uniform Resource Name)</strong>, on the other hand, names a resource persistently without referring to its physical address, such as urn:isbn:978-3-16-148410-0.</p><p>In the context of URL shorteners, only URLs are relevant, that is, accessible paths, typically reached via HTTP or HTTPS. The challenge is to transform these access paths in a way that preserves their semantics while reducing their representation.</p><p><strong><strong>2.2 Principles of address shortening</strong></strong></p><p>The core idea of a URL shortener is to replace a long URL string with a shorter key that points to the original address via a mapping. 
This mapping is done either directly in a lookup store (e.g., hash map, database table) or indirectly via a computational method (e.g., a hash function with collision management).</p><p>The goal is to exploit the redundancy of long URLs and represent them by a significantly shorter string. This poses a trade-off between collision-freeness, brevity, and readability. Conventional methods are based on encoding unique keys in a Base62 alphabet ([0-9a-zA-Z]), which offers 62 states per character. Just six characters can represent over 56 billion unique URLs – sufficient for many production applications.</p><p>The short code acts as the primary key for address resolution. It is crucial that it is stable, efficiently generated, and as hard to guess as possible to prevent misuse (e.g., brute-force enumeration).</p><p><strong><strong>2.3 Entropy, collisions and permutation spaces</strong></strong></p><p>A key aspect of URL shortening is the question of how many different short addresses a system can actually generate. This directly depends on the length of the generated short codes and their character set. Many URL shorteners use a so-called Base62 alphabet. This includes the ten digits from zero to nine, the 26 lowercase letters, and the 26 uppercase letters, for a total of 62 different characters.</p><p>For example, if you generate short codes with a fixed length of six characters, you get a combinatorial space of 62^6 = 56,800,235,584 different character strings, that is, over 56 billion. Even with so few characters, billions of unique URLs can be represented, which is more than sufficient for many real-world applications. With every additional character, the address space grows by a factor of 62.</p><p>But the sheer number of possible combinations is only one aspect. How these short codes are generated is equally important. If the generation is random, it is essential to ensure that no duplicate codes are created – so-called collisions. 
These can be handled either by checking whether a code already exists before it is stored, or by deriving codes deterministically, for example with hash functions. However, hash methods are not without risks as the number of entries grows: the more entries there are, the higher the probability that two different URLs will receive the same short code, especially if the hash function has not been optimised for this use case.</p><p>Another criterion is the distribution of the generated short codes. A uniform distribution in the address space is desirable because, on the one hand, it reduces the risk of collisions, and on the other hand, it increases the efficiency of storage and retrieval mechanisms – for example, in sharding for distributed systems or caching in high-traffic environments. Cryptographically secure random numbers or specially designed generators play a crucial role here.</p><p>Overall, the choice of alphabet, the length of the short codes, and the way they are generated are not just technical parameters, but fundamental design decisions that significantly influence the security, efficiency, and scalability of a URL shortener.</p><p><strong><strong>3. Architecture of a URL shortener</strong></strong></p><p>The architecture of a URL shortener is surprisingly compact at its core, but by no means trivial. Although its basic function is simply to link a long URL with a short alias, numerous technical and conceptual decisions arise in the details. These include data storage, the structure of API access, concurrency behaviour, and security against misuse. This chapter explains the central components and their interaction, deliberately avoiding external frameworks. Instead, the focus is on a modular, transparent structure in pure Java.</p><p>At the heart of the system is a mapping table – typically in the form of a map or a persistent key-value database – that uniquely assigns each generated short code to its corresponding original URL. 
This structure forms the backbone of the shortener. Crucially, this mapping must be both efficiently readable and consistently modifiable, especially under load or when accessed concurrently by multiple clients.</p><p>A typical URL shortener consists of three logically separate units: an input endpoint for registering a new URL, a redirection endpoint for evaluating a short link, and a management unit that provides metadata such as expiration times or access counters. In a purely Java-based solution without frameworks, network access is provided via the HTTP server built into the JDK, the <strong>com.sun.net.httpserver</strong> package. The package itself has been part of the JDK since Java 6; Java 18 added the Simple Web Server on top of it. This allows you to define REST-like endpoints with minimal overhead and to handle requests and responses through <strong>HttpExchange</strong> objects.</p><p>There are various options for storing mappings. In-memory structures, such as <strong>ConcurrentHashMap</strong>, offer maximum speed but are volatile and unsuitable for production applications without a backup mechanism. Alternatively, file-based formats, relational databases, or object-oriented stores such as EclipseStore can be used. This paper will initially work with volatile storage to illustrate the basic logic. Persistence will be added modularly later.</p><p>Another key aspect concerns concurrency behaviour. Since URL shorteners typically face a large number of read accesses, for example, when short links are called, the architecture must be designed to allow concurrent access to the lookup table without locking conflicts. The same applies to the generation of new short codes, which must be atomic and collision-free. Java 24 offers modern tools for this, including virtual threads (final since Java 21) and structured concurrency (still a preview API in Java 24), which can be utilised to manage server load in a more deterministic and scalable manner.</p><p>Last but not least, horizontal extensibility plays a role. A cleanly decoupled design allows the shortener to be easily transferred to distributed systems later. 
For example, the actual URL resolver can be operated as a stateless service, while data storage is outsourced to a shared backend. Caching strategies and load balancing can also be integrated much more easily in such a setup.</p><p>In summary, a URL shortener is much more than a simple string replacement. Its architecture must be efficient, robust, and extensible – properties that can be achieved through a modular structure in pure Java.</p><p><strong><strong>4. Implementation with Java 24</strong></strong></p><p><strong><strong>4.1 Project structure and module overview</strong></strong></p><p>The implementation of the URL shortener follows a modular structure that supports clarity in the source code, testability, and extensibility. The project is organised into Java modules and leverages the capabilities of the Java Platform Module System (JPMS). The goal is to separate the core functionality, that is, the management of URL mappings, from the network layer and persistence. This keeps the business logic independent of specific storage or transport mechanisms.</p><p>At the centre is a module called <strong>shortener.core</strong>, which contains all domain-specific classes: for example, ShortUrlMapping and UrlEncoder, as well as the central <strong>UrlMappingStore</strong> interface with a simple in-memory implementation. A second module, <strong>shortener.http</strong>, is based on the JDK’s built-in HTTP server. It implements the REST endpoints and utilises the core module’s components for actual processing. Additional optional modules, such as those for persistence or analysis, can be added later.</p><p>To organise the code, a directory structure that clearly reflects the module and layer boundaries is recommended. 
Within the modules, a distinction should be made between <strong>api</strong>, <strong>impl</strong>, <strong>util</strong> and, if necessary, <strong>service</strong>.</p><p><strong><strong>4.2 URL Encoding: Hashing, Base62 and Alternatives</strong></strong></p><p>A central element of the shortener is the mechanism for generating short, unique codes. This implementation uses a hybrid method that generates a consecutive, atomic sequence number and converts it into a human-readable format using a Base62 encoder.</p><p>This choice has two advantages: First, it is deterministic and avoids collisions without the need for complex hash functions. Second, generated codes can be efficiently serialised and are easy to read, which is particularly relevant in marketing or print contexts. Alternatively, cryptographic hashes such as SHA-256 can be used when unpredictability and integrity protection are essential, for example, for signed links or zero-knowledge schemes.</p><p>The Base62 encoder is implemented as a pure utility class that encodes integer values into a character string, where the alphabet consists of numbers and letters. Inverse decoding is also provided in case bidirectional analysis is required in the future.</p><p><strong><strong>4.3 Mapping Store: Interface, Implementation, Synchronisation</strong></strong></p><p>For managing URL mappings, a clearly defined interface called <strong>UrlMappingStore</strong> provides methods for inserting new mappings, resolving short links, and optionally managing metadata. The default implementation, InMemoryUrlMappingStore, is based on a ConcurrentHashMap and utilises <strong>AtomicLong</strong> for sequence number generation.</p><p>This simple architecture is completely thread-safe and allows parallel access without external synchronisation mechanisms. 
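</p><p>A minimal sketch of how sequence numbers, Base62 encoding, and the in-memory store described above could fit together. The names <strong>UrlMappingStore</strong> and <strong>InMemoryUrlMappingStore</strong> come from the text; the <strong>Base62</strong> class name and the method names <strong>shorten</strong> and <strong>resolve</strong> are assumptions made for illustration and not necessarily the interface of the actual implementation.</p>

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical utility: encodes non-negative longs with the alphabet [0-9a-zA-Z]. */
final class Base62 {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private Base62() {}

    static String encode(long value) {
        if (value < 0) throw new IllegalArgumentException("value must be >= 0");
        if (value == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // Least significant digit first; the result is reversed at the end.
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.reverse().toString();
    }

    static long decode(String code) {
        long result = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + c);
            result = result * 62 + digit;
        }
        return result;
    }
}

/** Assumed shape of the store interface; the method names are illustrative. */
interface UrlMappingStore {
    String shorten(String originalUrl);
    Optional<String> resolve(String shortCode);
}

final class InMemoryUrlMappingStore implements UrlMappingStore {
    private final ConcurrentHashMap<String, String> mappings = new ConcurrentHashMap<>();
    private final AtomicLong sequence = new AtomicLong();

    @Override
    public String shorten(String originalUrl) {
        // incrementAndGet is atomic, so every caller receives a distinct number.
        String code = Base62.encode(sequence.incrementAndGet());
        mappings.put(code, originalUrl);
        return code;
    }

    @Override
    public Optional<String> resolve(String shortCode) {
        return Optional.ofNullable(mappings.get(shortCode));
    }
}
```

<p>In this sketch the first registered URL receives the code <strong>1</strong> and the 62nd receives <strong>10</strong>, so no collision check is needed; the trade-off is that sequential codes are easy to guess.</p><p>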
The implementation can be replaced at any time with a persistent variant, for example, based on flat file storage or through integration with an object-oriented storage system such as EclipseStore.</p><p>This separation keeps the application core stable while treating storage as a replaceable detail—a classic example of the dependency inversion principle in the spirit of Clean Architecture.</p><p><strong><strong>4.4 REST API with pure Java (HTTP server, handler, routing)</strong></strong></p><p>The REST interface is implemented exclusively with the built-in tools of the JDK. Java provides the package <strong>com.sun.net.httpserver</strong>, which offers a minimalistic yet powerful HTTP server ideal for lean services. For the implementation of the API, a separate <strong>HttpHandler</strong> is defined that responds to specific routes, such as <strong>/shorten</strong> for POST requests and <strong>/{code}</strong> for forwarding.</p><p>The implementation is based on a clear separation between parsing, processing, and response generation. Incoming JSON messages are parsed manually or with the help of simple helper classes, without the need for external libraries. HTTP responses also follow a minimalist format, characterised by structured status codes, simple header management, and UTF-8-encoded bodies.</p><p>Routing is handled by a dispatcher class, which selects the appropriate handler based on the request path and HTTP method. Later extensions, such as CORS, OPTIONS handling, or versioning, are easily possible.</p><p><strong><strong>4.5 Error handling, logging and monitoring</strong></strong></p><p>In a productive environment, robust error handling is essential. The implementation distinguishes between systematic errors (such as invalid inputs or missing short codes) and unexpected runtime errors (such as IO problems or race conditions). 
The former are reported with clear HTTP status codes, such as <strong>400 (Bad Request)</strong> or <strong>404 (Not Found)</strong>. The latter lead to a generic <strong>500 Internal Server Error</strong>, with the causes being logged internally.</p><p>For logging, the JDK’s own <strong>java.util.logging</strong> package is used. It allows for platform-independent logging and can be replaced with SLF4J-compatible systems if needed. Monitoring metrics such as access counts, response times, or error statistics can be made accessible via a separate endpoint or JMX.</p><p><strong><strong>5. Security aspects</strong></strong></p><p><strong><strong>5.1 Abuse opportunities and protection mechanisms</strong></strong></p><p>A URL shortener can easily be used to obscure content. Attackers deliberately exploit the shortening to redirect recipients to phishing sites, malware hosts, or dubious content without the target address being immediately visible. This can pose significant risks, especially for automated distribution via social networks, chatbots, or email campaigns.</p><p>An adequate protection mechanism consists of automatically validating all target addresses upon insertion, for example, through syntactical URL checks, DNS resolution, and optionally through a background query (HEAD request or proxy scan) that checks whether the target page is accessible and not suspicious. Such checks should be modular so that they can be activated or deactivated depending on the environment (e.g., offline operation). Additionally, every access to a short link should be logged, making it easier to identify patterns of abuse.</p><p><strong><strong>5.2 Rate limiting and IP-based throttling</strong></strong></p><p>Another risk lies in excessive use of the service, be it through botnets, targeted enumeration, or simple DoS behaviour. A robust URL shortener should therefore have rate limiting that restricts requests within a given time slot. 
Rate limiting can be global, IP-based, or per-user, depending on the context.</p><p>In a Java implementation without frameworks, this can be achieved, for example, via a <strong>ConcurrentHashMap</strong> that maintains a timestamp or counter buffer for each IP address. If a threshold is exceeded, the request is rejected with the status code <strong>429 Too Many Requests</strong>. This simple throttling can be supplemented with leaky bucket or token bucket algorithms if necessary to achieve a fairer distribution over time. For production use, logging of critical threshold violations is also recommended.</p><p><strong><strong>5.3 Validity period and deletion concepts</strong></strong></p><p>Not every short link should remain valid forever. A configurable validity period is essential, especially for security-critical applications, such as temporary document sharing or one-time authentication. A URL shortener should therefore offer the option of defining expiration times for each mapping.</p><p>On a technical level, it is sufficient to assign an expiration date to each mapping, which is checked during the lookup. When an expired short link is accessed, either an error status, such as <strong>410 Gone</strong>, is returned, or the user is redirected to a defined information page. 
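</p><p>The expiry check during lookup can be sketched as follows. This is an illustration under stated assumptions: the record fields, the <strong>ExpiringLookup</strong> and <strong>LookupResult</strong> names, and the use of 302 for the redirect are hypothetical; only the name ShortUrlMapping and the status 410 Gone are taken from the text. The current time is passed in explicitly so that the behaviour is easy to test.</p>

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical mapping record; expiresAt may be null for links that never expire. */
record ShortUrlMapping(String shortCode, String originalUrl, Instant expiresAt) {
    boolean isExpiredAt(Instant now) {
        // A link counts as expired from the exact expiry instant onwards.
        return expiresAt != null && !now.isBefore(expiresAt);
    }
}

/** Outcome of a lookup, expressed as the HTTP status a handler would send. */
record LookupResult(int status, String location) {}

final class ExpiringLookup {
    private final Map<String, ShortUrlMapping> mappings = new ConcurrentHashMap<>();

    void put(ShortUrlMapping mapping) {
        mappings.put(mapping.shortCode(), mapping);
    }

    LookupResult lookup(String shortCode, Instant now) {
        ShortUrlMapping mapping = mappings.get(shortCode);
        if (mapping == null) {
            return new LookupResult(404, null);   // unknown short code
        }
        if (mapping.isExpiredAt(now)) {
            return new LookupResult(410, null);   // known, but no longer valid
        }
        return new LookupResult(302, mapping.originalUrl()); // assumed redirect status
    }
}
```

<p>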
Additionally, there should be periodic cleanup mechanisms that remove expired or unused entries from memory, such as through a time-controlled cleanup process or lazy deletion upon access.</p><p><strong><strong>5.4 Protection against enumeration and information leakage</strong></strong></p><p>An often overlooked attack vector is the systematic scanning of the short code space – for example, by automated retrieval of every path from <strong>/aaaaaa</strong> to <strong>/zzzzzz</strong>. If a URL shortener delivers valid links without any protection mechanisms, potentially confidential information about the existence and use of links can be leaked.</p><p>Adequate protection consists of making the short codes themselves unpredictable – for example, by using cryptographically generated tokens instead of continuous sequences. Additionally, access restrictions can be introduced, allowing only authenticated clients to access certain short links or excluding specific IP ranges. The targeted obfuscation of error responses – for example, by consistently issuing <strong>404 Not Found</strong> even for blocked or expired short codes – makes analysis more difficult for attackers.</p><p>A further risk arises when metadata such as creation time, number of accesses, or request origin is exposed unprotected via the API. Such information should only be accessible to authorised users or administrative interfaces and should never be part of the public API output.</p><p><strong><strong>6. Performance and optimisation</strong></strong></p><p><strong><strong>6.1 Access times and hash lookups</strong></strong></p><p>The most common operation in a URL shortener is resolving a short code into its corresponding original URL. Since this is a classic lookup operation, the choice of the underlying data structure is crucial. The standard implementation uses a <strong>ConcurrentHashMap</strong>, which in current JDK versions, including Java 24, uses fine-grained locking for updates and generally non-blocking reads. 
This offers nearly constant access times – even under high concurrency – and is therefore ideal for read-intensive workloads, such as those typical of a shortener.</p><p>The latency of such an operation is in the range of a few microseconds at most, provided the lookup table is stored in main memory and no additional network or IO layers are involved. However, if data storage is outsourced to persistent systems, such as a relational database or a disk-based key-value store, the access time increases accordingly. Therefore, it is recommended to cache frequently accessed entries – either directly in memory or via a dedicated cache layer.</p><p>Performance also plays a role in the creation of new short codes. Here, sequence number generation with <strong>AtomicLong</strong> provides a thread-safe solution for linear ID assignment without explicit locks. Combined with Base62 encoding, this creates a fast, predictable, and collision-free process.</p><p><strong><strong>6.2 Memory usage and garbage collection</strong></strong></p><p>Since a URL shortener must manage a growing number of entries over a longer period, it is worthwhile to examine its storage behaviour. The in-memory variant keeps every mapping in a <strong>ConcurrentHashMap</strong>. While this results in fast access times, it also means that all active mappings remain permanently in memory unless cleanup is implemented. A simple mapping structure consisting of a short code, original URL, and an optional timestamp requires several hundred bytes per entry, depending on the JVM configuration and string length.</p><p>With several million entries, heap usage can reach several gigabytes. To improve efficiency, care should be taken to use objects sparingly. For example, common URL prefixes (e.g. <strong>https://</strong>) can be replaced with symbolic constants. 
Records instead of classic POJOs also help reduce object size and minimise GC load.</p><p>In the long term, it is recommended to introduce an active or passive cleanup mechanism, such as TTL-based eviction or access counters, to specifically remove rarely used entries. <strong>WeakReference</strong> or soft caching should be considered with caution, since the semantics of such structures do not always lead to expected behaviour in the server context.</p><p><strong><strong>6.3 Benchmarking: Local tests and load simulation</strong></strong></p><p>Systematic benchmarking is essential for objectively evaluating the performance of a URL shortener. At a local level, this can be achieved with simple Java benchmarks that measure sequence number generation, lookup time, and code distribution quality. Tools such as JMH (Java Microbenchmark Harness) can also be used. Although external tools are not used in this paper, a manual microbenchmarking approach using System.nanoTime and a targeted warm-up can provide valuable insights.</p><p>For more realistic tests, a load simulation with HTTP clients is suitable, for example, using simple JDK-based multi-thread scripts or tools such as <strong>curl</strong>. In particular, behaviour under high concurrent access load should be observed, both in terms of response times and resource consumption. Behaviour in the event of failed requests, rapid-fire access, or expired links should also be explicitly tested.</p><p>The goal of such benchmarks is not only to validate the maximum transaction rate, but also to verify stability under continuous load. A robust implementation should not only be high-performance but also deterministic in its response behaviour and resistant to out-of-memory errors. Optional profiling—for example, using JDK Flight Recorder—can reveal further optimisation potential.</p><p><strong><strong>7. 
Expansion options and variants</strong></strong></p><p><strong><strong>7.1 Custom aliases</strong></strong></p><p>A frequently expressed wish in practice is the ability to not only use automatically generated short links, but also to assign custom aliases – for example, for marketing campaigns, internal documents, or individual redirects. A custom alias, such as <strong>/travel2025</strong>, is much easier to remember than a random Base62 token and can be integrated explicitly into communication and branding.</p><p>Technically speaking, this expands the mapping store’s responsibility. Instead of only accepting numerically generated keys, the API must verify that a user-defined alias is syntactically valid, not already in use, and not reserved. A simple regex check, supplemented by a deny list for reserved terms (e.g. <strong>/admin</strong>, <strong>/api</strong>), is sufficient to get started. The alias must then be stored and treated in the same way as automatically generated codes.</p><p>This creates new failure modes, for example, when a user requests an alias that already exists. Such cases should be handled consistently with a <strong>409 Conflict</strong>. The API can optionally suggest alternative names – a small convenience feature with a significant impact on the user experience (UX).</p><p><strong><strong>7.2 Access counting and analytics</strong></strong></p><p>A functional URL shortener is more than just a redirection tool – it’s also an analytics tool. Tracking how often, when, and from where a short link was accessed is particularly relevant in the context of campaigns, product pages, or documented distribution.</p><p>To implement this functionality, each successful resolution of a short link must be saved as an event, either by simply incrementing a counter or by fully logging it with a timestamp, IP address, and user agent. For the in-memory variant, an additional <strong>AtomicLong</strong> per mapping or a metric structure aggregated via a map is sufficient. 
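</p><p>A minimal sketch of the counter-based variant, assuming one <strong>AtomicLong</strong> per short code held in a <strong>ConcurrentHashMap</strong>. The class name <strong>AccessCounter</strong> and its methods are hypothetical and only illustrate the idea.</p>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory hit counter, one AtomicLong per short code. */
final class AccessCounter {
    private final ConcurrentHashMap<String, AtomicLong> hits = new ConcurrentHashMap<>();

    /** Records one successful resolution and returns the new total for that code. */
    long recordHit(String shortCode) {
        // computeIfAbsent creates the counter atomically on first access.
        return hits.computeIfAbsent(shortCode, key -> new AtomicLong()).incrementAndGet();
    }

    /** Returns the number of recorded hits, or 0 if the code was never resolved. */
    long hitsFor(String shortCode) {
        AtomicLong counter = hits.get(shortCode);
        return counter == null ? 0L : counter.get();
    }
}
```

<p>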
Alternatively, detailed access data can be persisted in a dedicated log file or an external analytics module.</p><p>The evaluation can be performed either synchronously via API endpoints (e.g., /stats/{alias}) or asynchronously via export formats such as JSON, CSV, or Prometheus metrics. Integration with existing logging systems (e.g. via <strong>java.util.logging</strong> or <strong>Logstash</strong>) is straightforward.</p><p><strong><strong>7.3 QR code integration</strong></strong></p><p>For physical media, such as posters, packaging, or invitations, displaying a short link as a QR code is a useful extension. Integrating QR code generation into the URL shortener makes it possible to produce a visually encoded image of the link directly from the API.</p><p>Since no external libraries are used, QR code generation can be performed using a compact Java-based algorithm, such as one based on bit matrix generation and SVG output. Alternatively, a Base64-encoded PNG file can be delivered via an endpoint such as <strong>/qr/{alias}</strong>. The underlying data structure remains unchanged – only the representation is extended.</p><p>This feature not only enhances practical utility but also expands the service’s reach across multiple media channels.</p><p><strong><strong>7.4 Integration into messaging or tracking systems</strong></strong></p><p>In production architectures, a URL shortener rarely operates in isolation. Instead, it is part of larger pipelines – for example, in email delivery, chatbots, content management systems, or user interaction tracking. Flexible integration with messaging systems such as Kafka, RabbitMQ, or simple webhooks allows every link creation or access to be transmitted as an event to external systems.</p><p>In a pure Java environment, this can be done via simple HTTP requests, log files, or asynchronous event queues. 
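</p><p>The asynchronous queue variant could look roughly like this. It is a sketch under stated assumptions: the names <strong>LinkEvent</strong> and <strong>EventDispatcher</strong>, the queue capacity of 10,000, and the drop-when-full policy are all illustrative choices, not part of the original design.</p>

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical event type; the field names are illustrative assumptions. */
record LinkEvent(String type, String shortCode, Instant timestamp) {}

/** Hypothetical dispatcher that decouples the shortener from event consumers. */
final class EventDispatcher implements AutoCloseable {

  // Bounded queue (assumed capacity) so a slow consumer cannot exhaust memory.
  private final BlockingQueue<LinkEvent> queue = new LinkedBlockingQueue<>(10_000);
  private final List<Consumer<LinkEvent>> listeners = new CopyOnWriteArrayList<>();
  private final Thread worker;

  EventDispatcher() {
    worker = new Thread(this::drain, "event-dispatcher");
    worker.setDaemon(true); // do not keep the JVM alive for pending events
    worker.start();
  }

  /** Registers a consumer, e.g. a webhook caller or a log writer. */
  void subscribe(Consumer<LinkEvent> listener) {
    listeners.add(listener);
  }

  /** Non-blocking: returns false and drops the event if the queue is full. */
  boolean publish(LinkEvent event) {
    return queue.offer(event);
  }

  // Runs on the worker thread and forwards each event to all listeners.
  private void drain() {
    try {
      while (true) {
        LinkEvent event = queue.take();
        for (Consumer<LinkEvent> listener : listeners) {
          try {
            listener.accept(event);
          } catch (RuntimeException e) {
            // A failing listener must not stop delivery to the others.
          }
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the flag and let the worker exit
    }
  }

  /** Stops the worker thread; events still in the queue are discarded. */
  @Override
  public void close() {
    worker.interrupt();
  }
}
```

<p>Because <strong>publish</strong> never blocks, the redirect path is not slowed down by a slow consumer. The price is that events can be lost under overload, which may be acceptable for statistics but not for auditing.</p><p>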
Scenarios are conceivable in which a notification is automatically sent to a third-party system for each new short link, for example, to generate personalised campaigns or for auditing purposes. Access to short links can also be mapped to events, which are subsequently evaluated statistically or visualised in dashboards.</p><p>Depending on the level of integration, it is recommended to implement a dedicated event dispatcher that encapsulates incoming or outgoing events and forwards them in a loosely coupled manner. This keeps the shortener itself lean and the responsibilities clearly separated.</p><p><strong><strong>8. Legal and ethical aspects</strong></strong></p><p><strong><strong>8.1 Data protection for link tracking</strong></strong></p><p>A URL shortener that logs visits automatically operates within the scope of data protection law. As soon as data such as IP addresses, timestamps, or user agents are stored, they are at least potentially personal data in the legal sense. In the European Union, such data falls under the General Data Protection Regulation (GDPR), which entails specific obligations for operators.</p><p>The technical capability for analytics – for example, through access counting or geo-IP analysis – should therefore not be enabled implicitly. Instead, a URL shortener should be designed so that tracking mechanisms must be explicitly enabled, ideally with clear labelling for the end user. A differentiated configuration that distinguishes between anonymised and personal data collection is strongly recommended in professional environments.</p><p>Additionally, when storing personal data, a record of processing activities must be maintained, a legal basis (e.g., legitimate interest or consent) must be specified, and a defined retention period must be established. For publicly accessible shorteners, this may mean that tracking remains deactivated by default or is controlled via consent mechanisms. 
The implementation of such control structures is not part of the core functionality, but is an integral part of data protection-compliant operations.</p><p><strong><strong>8.2 Responsibility for forwarding</strong></strong></p><p>Another key point is the service provider’s responsibility for the content to which a link redirects. Even if a shortener technically only implements a redirect, legal responsibility arises as soon as the impression is created that the operator endorses or controls the target content. This is especially true for public or embedded shorteners, such as those found in corporate portals or social platforms.</p><p>The challenge lies in distinguishing between technical neutrality and de facto mediation. It is therefore advisable to integrate legal protection mechanisms into the architecture, for example, through a policy that excludes specific target domains from being shortened, regular URL revalidation, or the use of abuse detection systems. In the event of misuse or complaints, immediate deactivation of individual mappings should be possible, ideally via a separate administration interface.</p><p>This responsibility is not only legally relevant but also has a reputational impact: Shorteners used to spread harmful content quickly lose their credibility – and possibly also their access to platforms or search engines.</p><p><strong><strong>8.3 Transparency and disclosure of the destination address</strong></strong></p><p>A common criticism of URL shorteners is that the destination address is no longer visible to the user. This limits the ability to evaluate whether a link is trustworthy before clicking on it. 
From an ethical perspective, this raises the question of whether a shortener should offer a pre-check option.</p><p>Technically, this can be achieved through a special preview mode, for example via a suffix appended to the short link, as in <strong><a href="https://short.ly/abc123+" rel="nofollow noopener" target="_blank">https://short.ly/abc123+</a></strong>, or by explicitly calling an API or HTML preview page that transparently resolves the mapping. Instead of being redirected immediately, the user is first shown an information page that displays the original URL and redirects to the target only if desired. This function can be supplemented with information about validity, access statistics, or trustworthiness.</p><p>A transparent approach to redirects not only increases user acceptance but also reduces the potential for abuse, especially among security-conscious target groups. In sensitive environments, a mandatory preview page – for example, for all non-authenticated users – can be a helpful measure.</p><p><strong><strong>9. Conclusion and outlook</strong></strong></p><p><strong><strong>9.1 Lessons learned</strong></strong></p><p>The development of a URL shortener in pure Java, without frameworks or external libraries, has demonstrated how even seemingly trivial web services, upon closer inspection, reveal themselves to be complex systems with diverse requirements. From the basic function of address shortening to security aspects and operational and legal implications, the result is a system that must be architecturally well-structured, yet flexible and extensible.</p><p>A clear separation of responsibilities is particularly important: a stable mapping store, a deterministic encoder, a secure yet straightforward REST API, and understandable error handling form the backbone of a robust service. 
Modern language features available in Java 24, such as records, sealed types, and virtual threads, enable a remarkably compact, type-safe, and concurrency-capable implementation.</p><p>The conscious decision against frameworks not only maximised the learning effect but also contributed to a deeper understanding of HTTP, data storage, thread safety, and API design – a valuable perspective for developers who want to operate in a technology-independent environment.</p><p><strong><strong>9.2 Possible further developments (e.g. blockchain, DNSSEC)</strong></strong></p><p>Despite their apparent simplicity, URL shorteners represent a fascinating field for technological innovation. There are efforts to move away from centralised management of the mapping between short code and target URL and to use decentralised technologies such as blockchain instead. In this case, each link is stored as a transaction, providing resistance to manipulation and historical traceability. In practice, however, this places high demands on latency and infrastructure, which is why such approaches have so far rarely been used in production.</p><p>Another strand of development lies in integration with DNSSEC-based procedures. Here, not only would the short code itself be signed, but the authenticity of the resolved host would also be cryptographically verified. This could combine trust and verification, especially in security-critical areas such as government services, banks, or certificate authorities.</p><p>AI-supported heuristics, such as those for misuse detection or memory cleanup prioritisation, also offer potential. However, the integration of such mechanisms requires a data-efficient, explainable design that is compatible with applicable data protection regimes.</p><p><strong><strong>9.3 Importance of URL shorteners in the context of digital sovereignty</strong></strong></p><p>In today’s digital landscape, URL shorteners are more than just a convenience feature; they are a valuable tool. 
They influence the visibility, accessibility, and traceability of content. The question of whether and how a link is modified or redirected has a direct impact on information sovereignty and transparency, and thus on digital sovereignty.</p><p>Especially in the public sector, educational institutions, or organisations with strict compliance requirements, URL shorteners should not be operated as outsourced cloud services; instead, they should be developed in-house or at least integrated in a controlled manner. A self-hosted solution not only allows complete control over data flows and access histories but also protects against censorship-like outages or data-driven tracking by third parties.</p><p>This makes the URL shortener, as inconspicuous as its function may seem, a strategic component of a trustworthy IT infrastructure. It exemplifies the question: Who controls the path of information? In this respect, a custom shortener is not just a tool, but also a statement of identity.</p><p>The next part covers the implementation itself.</p><p>Happy Coding</p><p><a rel="nofollow noopener" class="hashtag u-tag u-category" href="https://svenruppert.com/tag/architecture/" target="_blank">#Architecture</a> <a rel="nofollow noopener" class="hashtag u-tag u-category" href="https://svenruppert.com/tag/design-pattern/" target="_blank">#DesignPattern</a> <a rel="nofollow noopener" class="hashtag u-tag u-category" href="https://svenruppert.com/tag/java/" target="_blank">#Java</a> <a rel="nofollow noopener" class="hashtag u-tag u-category" href="https://svenruppert.com/tag/security/" target="_blank">#security</a></p>