
DataDock

Self-hosted file sharing platform with advanced storage architecture, enterprise-grade security, and zero external dependencies

Role: Full-Stack Developer
Timeline: Apr 2025 - Present
Status: Live

01. Overview

DataDock is a powerful, self-hosted file sharing platform built from the ground up with zero external dependencies. Written in vanilla PHP 8+ with no frameworks, it provides enterprise-grade security and flexibility without the bloat of commercial solutions. The platform now includes nested folders, file tags, storage partitions, SHA-256 deduplication, hotlink monitoring, user authentication, drag-and-drop uploads, automatic thumbnail generation, guest upload capabilities, user quotas, an admin panel, and a one-click GitHub updater system. An official Docker image is published to Docker Hub; GitHub Actions builds and pushes the image on commits and releases so self-hosters can deploy with a pinned tag or latest.

The platform addresses the need for a secure, customizable file sharing solution that can be hosted on-premises or on private servers. Unlike cloud-based alternatives, DataDock gives organizations full control over their data, security policies, and customization options. It features configurable brute-force protection, hardened upload validation, file expiry management, quota enforcement (both file count and storage limits), partition-aware storage controls, and a comprehensive admin panel for site management.

02. Problems & Challenges

Building a secure, self-hosted file sharing platform presented several challenges:

  • Zero-Dependency Architecture: Building everything without third-party libraries or frameworks meant implementing core features by hand or directly on top of PHP's built-ins. This included password hashing (using PHP's `password_hash()`), secure random generation, brute-force protection, file validation, and thumbnail generation using only PHP's GD library.
  • Brute-Force Protection System: Implementing a configurable brute-force protection system that tracks failed login attempts per user, enforces lockout periods, and handles anonymous attempts required careful database design and time-based logic. The system needed to be configurable through the admin panel while maintaining security.
  • Hierarchical Organization System: Implementing nested folders with parent-child relationships while preserving dashboard filtering, breadcrumbs, folder-scoped uploads, and same-page actions required careful routing and query design to prevent navigation edge cases.
  • Quota & Partition Management: Implementing dual quota systems (file count and storage limits) for both registered users and guest uploads across partition-aware storage roots required efficient aggregation queries, pre-upload validation, and clear errors when limits are reached.
  • File Upload Security: Ensuring secure file uploads required validating file types via MIME type checking, preventing execution of dangerous file types (PHP, shell scripts, executables), validating multi-dot filenames (for example, `file.php.jpg`), scanning image and SVG bodies for embedded PHP tags, sanitizing filenames, and storing files with unique identifiers to prevent directory traversal attacks.
  • Automatic Thumbnail Generation: Implementing automatic thumbnail generation for images using PHP's GD library required handling various image formats, error handling for corrupted images, and efficient storage of thumbnails alongside original files.
  • Storage Deduplication: Implementing optional SHA-256 deduplication per storage partition required reference counting with safe create/delete flows so identical files share one physical blob while preserving correct user-level metadata.
  • One-Click Update System: Building a GitHub-based updater that fetches releases, downloads archives, extracts files, and updates the application while preserving user data and configuration required careful file system operations, error handling, and a dry-run mode for safety.
  • Hotlink Awareness: Adding optional referer-based hotlink logging for downloads, ZIP requests, thumbnails, and avatar requests required balancing useful abuse visibility with low false positives (same-site and empty referers are ignored).
  • UTC Time Management: Storing all timestamps in UTC and converting to local time on the frontend required consistent timezone handling across the application, JavaScript date conversion, and proper display formatting.
  • Flash Messaging System: Implementing a session-based flash messaging system that supports multiple success, error, and warning messages across all pages required careful session management and message cleanup to prevent message persistence issues.
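
The multi-dot filename and embedded-PHP checks described under File Upload Security can be sketched roughly as follows. This is a minimal illustration, not DataDock's actual code: the function names and the `FORBIDDEN_EXTENSIONS` list are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical blocklist; the project's real list is not shown in this write-up.
const FORBIDDEN_EXTENSIONS = ['php', 'phtml', 'phar', 'sh', 'exe', 'bat', 'cgi'];

/**
 * Returns true if ANY dot-separated segment after the base name is forbidden,
 * so "file.php.jpg" is rejected even though it ends in ".jpg".
 */
function has_forbidden_extension(string $filename): bool
{
    $segments = explode('.', strtolower(basename($filename)));
    array_shift($segments); // drop the base name, keep every extension segment
    foreach ($segments as $segment) {
        if (in_array($segment, FORBIDDEN_EXTENSIONS, true)) {
            return true;
        }
    }
    return false;
}

/** Returns true if the file body contains a PHP opening or echo tag. */
function contains_php_tag(string $body): bool
{
    return stripos($body, '<?php') !== false || strpos($body, '<?=') !== false;
}

var_dump(has_forbidden_extension('file.php.jpg')); // bool(true)
var_dump(has_forbidden_extension('photo.jpg'));    // bool(false)
```

The bare short tag `<?` is deliberately not matched here because it would also flag the `<?xml` prolog of ordinary SVG files. A byte-level scan like this can still produce false positives on binary image data, so it works best as one layer alongside MIME checks.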

Critical Challenge: Building Without Dependencies

One of the most significant challenges was building a production-ready application without any external dependencies. This meant implementing features that are typically handled by libraries: a custom Markdown parser for rendering changelogs, secure file upload handling, brute-force protection logic, quota calculation systems, and a GitHub API client for the updater. Every feature required understanding the underlying implementation details and security implications, rather than relying on battle-tested libraries.
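
To give a sense of what a hand-rolled Markdown renderer involves, here is a minimal single-line sketch. It is not the parser used in DataDock; the function names are hypothetical, and it covers only headings, bold, italic, inline code, and http(s) links. The key idea is to escape HTML first and only then add trusted tags.

```php
<?php
declare(strict_types=1);

function render_inline(string $html): string
{
    // Bold before italic so "**x**" is not consumed as two "*" pairs.
    $html = preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $html);
    $html = preg_replace('/\*(.+?)\*/', '<em>$1</em>', $html);
    $html = preg_replace('/`(.+?)`/', '<code>$1</code>', $html);
    // Only http(s) links become anchors; other schemes stay as plain text.
    return preg_replace(
        '/\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/',
        '<a href="$2" rel="noopener noreferrer">$1</a>',
        $html
    );
}

function render_markdown_line(string $line): string
{
    // Escape first so any raw HTML in the release notes is rendered inert.
    $html = htmlspecialchars($line, ENT_QUOTES, 'UTF-8');

    // Headings: "# Title" through "###### Title".
    if (preg_match('/^(#{1,6})\s+(.+)$/', $html, $m)) {
        $level = strlen($m[1]);
        return "<h{$level}>" . render_inline($m[2]) . "</h{$level}>";
    }
    return '<p>' . render_inline($html) . '</p>';
}

echo render_markdown_line('## Fixed **upload** bug'), "\n";
// <h2>Fixed <strong>upload</strong> bug</h2>
```

Because escaping happens before any tag is generated, a line such as `<script>alert(1)</script>` comes out as inert text, and a `javascript:` link is left unconverted.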

03. Solutions & Implementation

I addressed these challenges through careful security practices and thoughtful implementation:

  • Secure Password Hashing: Used PHP's built-in `password_hash()` and `password_verify()` functions for password storage. With `PASSWORD_DEFAULT`, this currently selects bcrypt, which generates a random salt automatically and embeds it in the resulting hash.
  • Brute-Force Protection System: Implemented a configurable brute-force protection system that tracks failed login attempts per user in a `login_attempts` table. The system enforces lockout periods based on configurable thresholds (max attempts, lockout minutes, lockout window) and clears attempts on successful login. Anonymous attempts are tracked using SHA-256 hashed identifiers.
  • Prepared Statements: All database queries use PDO prepared statements to prevent SQL injection attacks. User-supplied values are escaped with `htmlspecialchars()` and a custom `sanitize_data()` function to prevent XSS attacks.
  • File Upload Security: Implemented comprehensive file upload security: MIME type validation using `mime_content_type()`, forbidden extension list (PHP, executables, scripts), unique filename generation using `uniqid()`, and file size validation at multiple levels (PHP ini, application settings, frontend).
  • Folders, Tags, and Quota Enforcement: Implemented nested folders and optional tags while enforcing dual quota systems that check both file count and total storage before allowing uploads. Quotas are enforced for both registered users and guest uploads, with separate configurable limits and efficient SQL aggregation queries.
  • Automatic Thumbnail Generation: Implemented automatic thumbnail generation for images using PHP's GD library. The system detects image MIME types, creates 100px-wide thumbnails, stores them in a separate directory, and handles errors gracefully for corrupted images.
  • Storage Partitions & Path Resolution: Added multi-root storage support with a centralized partition-aware path layer so uploads, downloads, ZIP generation, one-time links, and thumbnail handling resolve to the correct disk root consistently.
  • SHA-256 Deduplication: Built optional per-partition deduplication using `storage_objects` reference tracking so duplicate content reuses a single file on disk while deletes and purges safely decrement references.
  • One-Click GitHub Updater: Built a complete updater system that fetches the latest release from GitHub's API, downloads ZIP or TAR.GZ archives, extracts files, preserves user data and configuration, and includes a dry-run mode for safe testing. The system uses cURL for downloads and includes comprehensive error handling.
  • Custom Markdown Parser: Implemented a custom Markdown parser for safely rendering GitHub release notes and changelogs. The parser supports headings, bold, italic, code blocks, links, lists, and horizontal rules while sanitizing HTML to prevent XSS attacks.
  • UTC Time Storage: All timestamps are stored in UTC in the database. The frontend converts UTC timestamps to local time using JavaScript's Date API, ensuring consistent time display across different timezones.
  • Hotlink Monitoring: Implemented optional referer-aware event logging for download surfaces, with trusted-host allowlists and admin log visibility for detecting external embedding and scraping activity.
  • Config Folder Protection: Secured the `config/` directory using `.htaccess` to prevent direct access to database credentials and settings files, ensuring sensitive configuration data cannot be accessed via HTTP.
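
The reference-counting idea behind SHA-256 deduplication can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions: the `DedupStore` class is hypothetical and replaces both the `storage_objects` table and the partition's disk with a plain array, so it shows only the counting logic, not the real create/delete flows.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for a storage_objects table plus one partition's disk. */
final class DedupStore
{
    /** @var array<string, array{refs: int, bytes: string}> keyed by SHA-256 hex */
    private array $objects = [];

    /** Stores content (or reuses an identical blob) and returns its hash. */
    public function put(string $content): string
    {
        $hash = hash('sha256', $content);
        if (isset($this->objects[$hash])) {
            $this->objects[$hash]['refs']++; // duplicate: add a reference only
        } else {
            $this->objects[$hash] = ['refs' => 1, 'bytes' => $content];
        }
        return $hash;
    }

    /** Drops one reference; the blob is removed only when none remain. */
    public function release(string $hash): void
    {
        if (!isset($this->objects[$hash])) {
            return;
        }
        if (--$this->objects[$hash]['refs'] === 0) {
            unset($this->objects[$hash]);
        }
    }

    public function refCount(string $hash): int
    {
        return $this->objects[$hash]['refs'] ?? 0;
    }

    public function blobCount(): int
    {
        return count($this->objects);
    }
}
```

Two users uploading the same bytes would each get their own file record pointing at one hash. In a database-backed version, the increment and decrement would need to run inside a transaction so that concurrent uploads and deletes cannot corrupt the count.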

Brute-Force Protection Implementation

Here's how I implemented the configurable brute-force protection system:

login.php
// Excerpt: $now and $lastAttemptTime are assumed to be DateTimeImmutable
// values in UTC, so modify() returns a new object instead of mutating.

// Count failed attempts within the lockout window
if ($bruteEnabled) {
    $windowStart = $now->modify("-{$lockoutWindow} minutes")
        ->format('Y-m-d H:i:s');

    $stmt = $pdo->prepare(
        "SELECT COUNT(*) FROM login_attempts
         WHERE user_id = ? AND success = 0 AND attempted_at > ?"
    );
    $stmt->execute([$userId, $windowStart]);
    $failedAttempts = (int) $stmt->fetchColumn();

    // Enforce lockout once the threshold is reached
    if ($failedAttempts >= $maxAttempts) {
        // Lockout expires $lockoutMinutes after the most recent attempt
        $lockoutUntil = $lastAttemptTime
            ->modify("+{$lockoutMinutes} minutes");

        if ($now < $lockoutUntil) {
            // User is still locked out
            return "Too many failed attempts";
        }
    }
}

// On successful login, clear failed attempts
if (password_verify($password, $user['password_hash'])) {
    if ($bruteEnabled) {
        $pdo->prepare(
            "DELETE FROM login_attempts WHERE user_id = ?"
        )->execute([$userId]);
    }
    // Set session and redirect
}
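
For attempts that do not map to a known account, the write-up mentions SHA-256 hashed identifiers. One way such an identifier could be derived is sketched below. Which inputs DataDock actually hashes is not stated here, so combining the client IP with the attempted login name is purely an assumption, and `anonymous_attempt_key()` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Derives a stable key for an unauthenticated client without storing the
 * raw IP or login name. The choice of inputs is an assumption for this sketch.
 */
function anonymous_attempt_key(string $ip, string $attemptedLogin): string
{
    // Normalize so "Alice " and "alice" count toward the same bucket.
    $normalized = strtolower(trim($attemptedLogin));

    // The NUL separator keeps ("1.2.3.4", "5x") distinct from ("1.2.3.45", "x").
    return hash('sha256', $ip . "\0" . $normalized);
}

echo strlen(anonymous_attempt_key('203.0.113.7', 'alice')), "\n"; // 64
```

The resulting 64-character hex string could then be stored in place of a user ID when counting failures inside the lockout window.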

04. Key Learnings

This project taught me valuable lessons about security, data architecture, and building from scratch:

  • Security First Mindset: I learned to think about security from the beginning of development, not as an afterthought. Every feature needs to be designed with security in mind, from file uploads to authentication to configuration file protection.
  • Building Without Dependencies: Working without external libraries or frameworks forced me to understand the underlying implementations of common features. I implemented a custom Markdown parser, brute-force protection logic, quota calculation systems, and a GitHub API client. This deep understanding helps when choosing and using libraries in other projects.
  • PHP Native Functions: I learned to leverage PHP's built-in security functions like `password_hash()`, `password_verify()`, `random_bytes()`, `htmlspecialchars()`, and `mime_content_type()`. Understanding what PHP provides natively helps avoid unnecessary dependencies while maintaining security.
  • Database Design for Security: I learned to design database schemas that support security features like tracking login attempts, storing timestamps in UTC, and maintaining referential integrity. The `login_attempts` table design allows tracking both user-specific and anonymous attempts.
  • File System Security: Implementing secure file uploads taught me about MIME type validation, filename sanitization, unique file naming, and preventing directory traversal attacks. I learned to never trust user-provided filenames and to store files with generated identifiers.
  • Storage Architecture at Scale: Adding folders, storage partitions, and optional deduplication taught me to separate logical file records from physical storage objects. This design keeps user workflows simple while enabling lower disk usage and safer cleanup operations.
  • Quota System Design: Implementing quota systems taught me about efficient database aggregation queries, pre-validation before file processing, and providing clear user feedback when limits are reached. The system needs to be fast (check before upload) and accurate (prevent race conditions).
  • Session Management: Working with PHP sessions taught me about flash messaging systems, session cleanup, and preventing message persistence issues. I learned to clear flash messages after display and handle session state carefully.
  • Time Zone Handling: Implementing UTC storage with frontend conversion taught me about timezone consistency, JavaScript Date API, and the importance of storing all timestamps in UTC to avoid timezone-related bugs.
  • Threat Visibility Matters: Beyond prevention, I learned the value of operational visibility. Hotlink logging gives administrators practical evidence of abuse patterns without collecting unnecessary user telemetry.
  • Admin Panel Architecture: Building a comprehensive admin panel taught me about role-based access control, sectioned interfaces, and maintaining separation between admin and user functionality while sharing common code.
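
The pre-upload quota check mentioned above can be reduced to a small pure function. This is a minimal sketch, not DataDock's code: `check_quota()` is a hypothetical name, and treating a limit of 0 as "unlimited" is an assumption. The current usage figures would typically come from a single aggregation query (for example a `COUNT(*)` and `SUM()` over the user's file rows).

```php
<?php
declare(strict_types=1);

/**
 * Returns null when the upload fits, or a user-facing error message.
 * A limit of 0 is treated as "unlimited" (an assumption for this sketch).
 */
function check_quota(
    int $currentFiles,
    int $currentBytes,
    int $incomingBytes,
    int $maxFiles,
    int $maxBytes
): ?string {
    if ($maxFiles > 0 && $currentFiles + 1 > $maxFiles) {
        return "File limit reached ({$maxFiles} files).";
    }
    if ($maxBytes > 0 && $currentBytes + $incomingBytes > $maxBytes) {
        return 'Storage limit reached.';
    }
    return null;
}

var_dump(check_quota(10, 0, 10, 10, 0)); // string(30) "File limit reached (10 files)."
```

On its own, this check does not prevent two simultaneous uploads from both passing. Closing that race would require running the usage query and the insert inside one transaction or behind a lock.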

Major Takeaway

The biggest lesson was understanding that building without dependencies doesn't mean reinventing the wheel - it means understanding what the wheel does and implementing it correctly. I learned to leverage PHP's native security functions, understand cryptographic principles, and implement features from first principles. This project significantly improved my understanding of web application security, database design, file system operations, and the importance of building maintainable, secure code without external dependencies. The experience of implementing a GitHub updater, Markdown parser, and brute-force protection from scratch gave me deep insights into how these systems work under the hood.

05. Tech Stack

Backend

PHP 8+, MySQL 8, PDO, GD Library

Frontend

JavaScript (Vanilla), HTML5, CSS3, XMLHttpRequest

Security

`password_hash()`, Prepared Statements, Brute-Force Protection, Upload Hardening, MIME Validation, `.htaccess`

Features

cURL, GitHub API, Markdown Parser, Session Management, Storage Partitions, SHA-256 Deduplication, Hotlink Monitoring

Deployment

Docker, Docker Hub, GitHub Actions

06. Results & Impact

DataDock successfully provides a secure, self-hosted file sharing solution with zero external dependencies:

  • User Authentication System: Implemented secure user registration, login, and session management with PHP's `password_hash()` for password storage. The system supports login via username or email and includes configurable brute-force protection with lockout periods and attempt tracking.
  • File Upload System: Built a comprehensive file upload system with drag-and-drop support, multiple file uploads, automatic thumbnail generation for images, file expiry dates, and AJAX upload progress tracking. The system validates file types via MIME checking, blocks executable multi-dot extension abuse, and scans image/SVG content for embedded PHP payloads.
  • Folders & Tags: Added nested folder organization with breadcrumbs, folder-scoped uploads, move-file actions, and optional tagging so users can manage large file libraries with better structure and faster discovery.
  • Quota Management: Implemented dual quota systems (file count and storage limits) for both registered users and guest uploads. Quotas are enforced before upload processing, with clear error messages when limits are reached. The system efficiently calculates current usage using SQL aggregation.
  • Guest Upload Support: Implemented optional guest upload functionality with cookie-based guest identification, separate quota limits, and the ability to enable/disable guest uploads through the admin panel.
  • Admin Panel: Built a comprehensive admin panel with multiple sections: site settings (site name, registration, file size limits, brute-force configuration), user management (change roles, delete users), file management (view all files, purge expired files with statistics), and site reset functionality.
  • One-Click GitHub Updater: Implemented a complete updater system that fetches the latest release from GitHub's API, downloads and extracts archives, preserves user data and configuration, and includes a dry-run mode for safe testing. The system handles both ZIP and TAR.GZ formats and includes comprehensive error handling.
  • Storage Partitions & Deduplication: Implemented multi-root storage partitions with optional SHA-256 deduplication per partition. Duplicate content can share a single physical blob while user records remain independent through reference counting.
  • Hotlink Monitoring: Added optional referer-aware logging for download, ZIP, thumbnail, and avatar requests from external hostnames, with trusted-host exceptions and admin visibility via the Hotlink Log section.
  • Changelog & Release Notes: Built a custom Markdown parser for safely rendering GitHub release notes and changelogs. The parser supports headings, bold, italic, code blocks, links, lists, and horizontal rules while sanitizing HTML to prevent XSS attacks.
  • Flash Messaging System: Implemented a session-based flash messaging system that supports multiple success, error, and warning messages across all pages. Messages are automatically cleared after display to prevent persistence issues.
  • UTC Time Management: All timestamps are stored in UTC in the database. The frontend converts UTC timestamps to local time using JavaScript, ensuring consistent time display across different timezones.
  • Security Features: Implemented comprehensive security including prepared statements for SQL injection prevention, input sanitization for XSS prevention, MIME type validation, forbidden file type blocking, config folder protection via `.htaccess`, and brute-force protection with configurable thresholds.
  • Installation System: Built a complete installation wizard that creates the database schema, sets up configuration files, creates the admin user, and secures the config directory. The installer includes validation and error handling.
  • Container Distribution: The app ships as a public Docker image on Docker Hub (zacharykeatings/datadock), with GitHub Actions building and pushing images on push and release so deployments stay aligned with the repo.
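
The hotlink decision described above (ignore empty and same-site referers, allow trusted hosts, log the rest) could look roughly like this. It is a sketch under assumptions: `is_hotlink()` and its parameters are hypothetical names, and hosts are compared by exact match only.

```php
<?php
declare(strict_types=1);

/** Returns true when a request's Referer points at an external, untrusted host. */
function is_hotlink(?string $referer, string $siteHost, array $trustedHosts = []): bool
{
    if ($referer === null || trim($referer) === '') {
        return false; // empty referers are ignored
    }

    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // unparseable referer: treat as unknown, do not log
    }

    $host = strtolower($host);
    if ($host === strtolower($siteHost)) {
        return false; // same-site requests are ignored
    }

    // Anything external is a hotlink unless the admin has allowlisted the host.
    return !in_array($host, array_map('strtolower', $trustedHosts), true);
}

var_dump(is_hotlink('https://evil.example/page', 'files.example.com')); // bool(true)
```

A caller would pass `$_SERVER['HTTP_REFERER'] ?? null` and the site's own hostname, then write a log row when the function returns true. Since the Referer header is client-controlled, the result is useful for visibility rather than as an access control.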

The project showcases my ability to build secure, self-contained applications from scratch without external dependencies. It demonstrates deep understanding of PHP's native security functions, database design, file system operations, session management, and web application security best practices. The platform is production-ready and includes enterprise-grade features like quota management, brute-force protection, automatic updates, and comprehensive admin controls.
