Adding Multi-File Type Support to the Blog Encryption System

The Problem

In previous posts, we built an end-to-end encryption system for the blog: Markdown text is encrypted with AES-256-GCM, images are encrypted separately into .enc files, and everything is decrypted and rendered in the browser. This setup works well for standard blog posts, but it has one fundamental limitation: it only handles .md files.

In practice, a lot of content isn't a natural fit for Markdown:

  • A research paper draft is already a PDF.
  • An experiment report is most intuitive as a Jupyter Notebook.
  • A tech talk presentation is a PPTX file.
  • Mathematical derivations written in LaTeX lose formatting when converted to Markdown.

Forcing a conversion to Markdown would destroy layout and interactivity, while treating them as simple attachments ruins the "read online" experience. I need the encryption system to natively support these file types.

Design: A Type-Based Dispatch System

Since no single pipeline fits every format, the solution is to dispatch on file type. At build time, each file goes through a type-specific preprocessing step and is then encrypted into a common JSON format that carries a type flag; on the client side, that flag selects the rendering method.

| File Type   | Extension               | Build-Time Action               | Client-Side Rendering |
|-------------|-------------------------|---------------------------------|-----------------------|
| Text        | .md                     | Encrypt raw Markdown            | marked.js → HTML      |
| Notebook    | .ipynb                  | Convert to HTML → Encrypt       | Inject HTML directly  |
| Native PDF  | .pdf                    | Encrypt raw binary              | Blob URL + <iframe>   |
| Convertible | .tex .docx .pptx etc.   | Convert to PDF → Encrypt binary | Same as PDF           |

The key takeaway: All file types share the same two-layer key system (PBKDF2 + AES-256-GCM). The changes are confined to the preprocessing and rendering stages, leaving the encryption core untouched.

Build Time (Node.js)
══════════════════════════════════════════════════════════════

  .md ─────────────────────→ Encrypt text    ─→ slug.json (type: markdown)
  .ipynb ──→ convertNotebook → Encrypt HTML   ─→ slug.json (type: html)
  .pdf ────────────────────→ Encrypt binary   ─→ slug.json (type: pdf)
  .tex ───→ pdflatex ──┐
  .pptx ──→ soffice ───┤──→ Encrypt binary   ─→ slug.json (type: pdf)
  .docx ──→ soffice ───┘

Client-Side
══════════════════════════════════════════════════════════════

  slug.json ──→ User enters password ──→ Decrypt slot with PBKDF2 ──→ Decrypt with AES-GCM

                    ┌───────────────────────────────────────────────────┼──────────────┐
                    ▼                                                   ▼              ▼
              type: markdown                                      type: html     type: pdf
              marked.parse()                                      Directly inject  Blob URL
                    ▼                                                   ▼              ▼
               Render HTML                                       Render notebook   <iframe>

Build-Time Implementation

File Discovery: From Markdown to All Types

The first step is to teach the build script to recognize new file types. The original findMarkdownFiles is upgraded to findEncryptableFiles:

const TEXT_TYPES = ['.md'];
const NOTEBOOK_TYPES = ['.ipynb'];
const PDF_TYPE = ['.pdf'];
const PDF_CONVERTIBLE = ['.pptx', '.docx', '.rtf', '.tex', '.odt', '.ods', '.odp'];
const ALL_SUPPORTED = [...TEXT_TYPES, ...NOTEBOOK_TYPES, ...PDF_TYPE, ...PDF_CONVERTIBLE];

When scanning the _content/crypt-posts/ directory, the script now matches all files with extensions in ALL_SUPPORTED.
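
As a rough illustration of the scan described above, here is a minimal sketch of what findEncryptableFiles might look like. The recursive walk, the case-insensitive extension check, and the sorted return value are assumptions made for the example; they are not taken from the actual build script. Note that .meta.json sidecar files are skipped automatically, because .json is not in ALL_SUPPORTED.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const TEXT_TYPES = ['.md'];
const NOTEBOOK_TYPES = ['.ipynb'];
const PDF_TYPE = ['.pdf'];
const PDF_CONVERTIBLE = ['.pptx', '.docx', '.rtf', '.tex', '.odt', '.ods', '.odp'];
const ALL_SUPPORTED = [...TEXT_TYPES, ...NOTEBOOK_TYPES, ...PDF_TYPE, ...PDF_CONVERTIBLE];

// Hypothetical implementation: recursively collect every file whose
// extension (compared case-insensitively) appears in ALL_SUPPORTED.
function findEncryptableFiles(dir) {
    const found = [];
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        const fullPath = path.join(dir, entry.name);
        if (entry.isDirectory()) {
            // Descend into series folders such as gol/
            found.push(...findEncryptableFiles(fullPath));
        } else if (ALL_SUPPORTED.includes(path.extname(entry.name).toLowerCase())) {
            found.push(fullPath);
        }
    }
    // Sort so that the build order is stable across runs.
    return found.sort();
}
```

A call such as findEncryptableFiles('_content/crypt-posts') would then return every candidate file; whether a file is actually encrypted still depends on its metadata.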

Metadata Source: Frontmatter vs. .meta.json

Encryption settings for Markdown files are defined in YAML frontmatter, which is fine. But binary files don't have frontmatter, so we'll establish a convention: non-Markdown files will use a corresponding .meta.json file for their metadata.

_content/crypt-posts/gol/
├── alife-paper-draft.pdf           # PDF source file
├── alife-paper-draft.meta.json     # Metadata
└── 2026-02-25-cross-experiment-en.md   # Markdown (with built-in frontmatter)

The format of .meta.json mirrors the fields in the frontmatter:

{
  "title": "Paper Draft",
  "date": "2026-02-26",
  "encrypted": [{ "key": "gol", "hint": ["fenda"] }],
  "series": "gol"
}

The build script's getEncryptedMeta() function unifies the logic for reading from both sources:

function getEncryptedMeta(filePath) {
    const ext = path.extname(filePath).toLowerCase();

    if (ext === '.md') {
        // Markdown: Read from frontmatter
        const { frontmatter, body } = parseFrontmatter(fs.readFileSync(filePath, 'utf-8'));
        if (!frontmatter.encrypted) return null;
        return { encrypted: frontmatter.encrypted, body, ext };
    }

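    // Other files: Read from the sibling .meta.json.
    // Slice off the extension by length instead of regex-matching the
    // lowercased ext, so "Draft.PDF" maps to "Draft.meta.json" rather than to itself.
    const metaPath = filePath.slice(0, filePath.length - ext.length) + '.meta.json';
    if (!fs.existsSync(metaPath)) return null;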
    const meta = JSON.parse(fs.readFileSync(metaPath, 'utf-8'));
    if (!meta.encrypted) return null;
    return { encrypted: meta.encrypted, ext };
}

Content Preprocessing Dispatch

This is the most critical change on the build side—branching the processing logic based on the file extension ext:

if (TEXT_TYPES.includes(ext)) {
    // Markdown: Process image references + encrypt raw text
    const processedBody = processMarkdownImages(meta.body, filePath, articleKeyString, slugDir);
    contentBuffer = Buffer.from(processedBody, 'utf-8');
    contentType = 'markdown';

} else if (NOTEBOOK_TYPES.includes(ext)) {
    // Notebook: Convert to HTML at build time, then encrypt the HTML
    const html = convertNotebook(filePath);
    contentBuffer = Buffer.from(html, 'utf-8');
    contentType = 'html';

} else if (PDF_TYPE.includes(ext)) {
    // PDF: Encrypt the raw binary directly
    contentBuffer = fs.readFileSync(filePath);
    contentType = 'pdf';

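} else if (PDF_CONVERTIBLE.includes(ext)) {
    // LaTeX/Office → Convert to PDF first → Then encrypt the binary
    const pdfPath = convertToPdf(filePath);
    contentBuffer = fs.readFileSync(pdfPath);
    contentType = 'pdf';

} else {
    // Fail loudly instead of encrypting an undefined contentBuffer
    throw new Error(`Unsupported file type: ${ext}`);
}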

Format Conversion

We'll use pdflatex for LaTeX and LibreOffice in headless mode for Office documents:

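function convertToPdf(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    // tmpDir is assumed to be a scratch directory set up elsewhere in the build script
    if (ext === '.tex') {
        execSync(`pdflatex -interaction=nonstopmode -output-directory="${tmpDir}" "${filePath}"`,
            { timeout: 60000, stdio: 'pipe' });
    } else {
        execSync(`soffice --headless --convert-to pdf --outdir "${tmpDir}" "${filePath}"`,
            { timeout: 60000, stdio: 'pipe' });
    }
    // Both tools write <basename>.pdf into the output directory; return that path to the caller
    const pdfPath = path.join(tmpDir, path.basename(filePath, path.extname(filePath)) + '.pdf');
    if (!fs.existsSync(pdfPath)) {
        throw new Error(`PDF conversion produced no output for ${filePath}`);
    }
    return pdfPath;
}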

The -interaction=nonstopmode flag prevents LaTeX from waiting for interactive input on compilation errors, and --headless allows LibreOffice to run in an environment without a GUI.

Changes to the Encryption Function

The function signature is generalized:

// Old: encryptPost(markdownBody, articleKey, passwords)
// New: encryptContent(contentBuffer, articleKey, passwords, contentType)

The output JSON now includes a type field, which the client uses to determine the rendering method:

{
  "type": "pdf",
  "content": { "iv": "...", "ciphertext": "..." },
  "keys": [{ "salt": "...", "iv": "...", "data": "...", "tag": "..." }]
}

If type is missing, the client defaults to markdown for backward compatibility with older posts.
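
To make the two-layer scheme concrete, here is a minimal sketch of how an encryptContent function could produce this JSON with Node's built-in crypto module. Several details are assumptions rather than facts about the real build script: the PBKDF2 iteration count and hash, Base64 as the field encoding, a 32-byte Buffer as articleKey, and appending the content's GCM tag to ciphertext (the layout Web Crypto expects when decrypting).

```javascript
const nodeCrypto = require('node:crypto');

// Assumed parameter; the real build script may use a different value.
const PBKDF2_ITERATIONS = 100000;

const b64 = (buf) => buf.toString('base64');

// AES-256-GCM with a fresh random 12-byte IV per call.
function aesGcmEncrypt(key, plaintext) {
    const iv = nodeCrypto.randomBytes(12);
    const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
    const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    return { iv, data, tag: cipher.getAuthTag() };
}

// Sketch only: contentBuffer is any Buffer (Markdown, HTML, or PDF bytes),
// articleKey is a 32-byte Buffer, passwords is an array of strings.
function encryptContent(contentBuffer, articleKey, passwords, contentType) {
    // Layer 1: encrypt the content once with the article key.
    const content = aesGcmEncrypt(articleKey, contentBuffer);

    // Layer 2: wrap the article key once per password (one "key slot" each).
    const keys = passwords.map((password) => {
        const salt = nodeCrypto.randomBytes(16);
        const slotKey = nodeCrypto.pbkdf2Sync(password, salt, PBKDF2_ITERATIONS, 32, 'sha256');
        const slot = aesGcmEncrypt(slotKey, articleKey);
        return { salt: b64(salt), iv: b64(slot.iv), data: b64(slot.data), tag: b64(slot.tag) };
    });

    return {
        type: contentType,
        // Assumption: the 16-byte GCM tag is appended to the ciphertext.
        content: { iv: b64(content.iv), ciphertext: b64(Buffer.concat([content.data, content.tag])) },
        keys,
    };
}
```

Because the function only ever sees a Buffer, the same code path serves text and binary content; contentType is passed through to the output untouched.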

Client-Side Implementation

tryDecrypt() Returns Content Type

The decryption function needs to read the type field from the encrypted JSON and return it. Notably, since PDF data is binary, it cannot be decoded with TextDecoder—we return the ArrayBuffer directly:

const contentType = encData.type || 'markdown';
if (contentType === 'pdf') {
    return { rawBuffer: decrypted, contentKeyB64, contentType };
}
return { plaintext: new TextDecoder().decode(decrypted), contentKeyB64, contentType };
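
For context, the two decryption layers that produce the decrypted buffer could be sketched with the Web Crypto API as follows. This is an illustration under stated assumptions, not the blog's actual tryDecrypt: the name decryptWithPassword, the iteration count, the SHA-256 hash, the Base64 field encoding, and the tag-appended-to-ciphertext layout for content are all hypothetical. The subtle parameter exists only so the function can be exercised outside a browser.

```javascript
// Assumed to match the build script; the value here is a placeholder.
const PBKDF2_ITERATIONS = 100000;

const b64ToBytes = (b64) => Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));

// Returns the decrypted ArrayBuffer, or null if no key slot accepts the password.
async function decryptWithPassword(encData, password, subtle = globalThis.crypto.subtle) {
    const baseKey = await subtle.importKey(
        'raw', new TextEncoder().encode(password), 'PBKDF2', false, ['deriveKey']);

    for (const slot of encData.keys) {
        try {
            // Layer 2: derive this slot's key from the password and unwrap the article key.
            const slotKey = await subtle.deriveKey(
                { name: 'PBKDF2', salt: b64ToBytes(slot.salt),
                  iterations: PBKDF2_ITERATIONS, hash: 'SHA-256' },
                baseKey, { name: 'AES-GCM', length: 256 }, false, ['decrypt']);
            // Web Crypto expects the GCM tag appended to the ciphertext.
            const wrapped = new Uint8Array([...b64ToBytes(slot.data), ...b64ToBytes(slot.tag)]);
            const rawKey = await subtle.decrypt(
                { name: 'AES-GCM', iv: b64ToBytes(slot.iv) }, slotKey, wrapped);

            // Layer 1: decrypt the content with the unwrapped article key.
            const articleKey = await subtle.importKey('raw', rawKey, 'AES-GCM', false, ['decrypt']);
            return await subtle.decrypt(
                { name: 'AES-GCM', iv: b64ToBytes(encData.content.iv) },
                articleKey, b64ToBytes(encData.content.ciphertext));
        } catch {
            // Authentication failed for this slot (wrong password); try the next one.
        }
    }
    return null;
}
```

The returned ArrayBuffer is what the type check above then either hands over as-is (PDF) or decodes as UTF-8 text (Markdown and HTML).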

renderDecryptedContent() Renders by Type

This new function is the client-side "dispatch center":

function renderDecryptedContent(result, article) {
    const type = result.contentType || 'markdown';

    if (type === 'pdf') {
        const blob = new Blob([result.rawBuffer], { type: 'application/pdf' });
        const url = URL.createObjectURL(blob);
        // <iframe> has no type attribute; the Blob's MIME type tells the browser this is a PDF
        article.innerHTML = `<iframe src="${url}" style="width:100%;height:80vh;` +
            `border:1px solid #ddd;border-radius:6px;"></iframe>`;
        return null;
    }

    if (type === 'html') {
        article.innerHTML = encNotebookCss + result.plaintext;
        return result.plaintext;
    }

    // Default: markdown
    article.innerHTML = marked.parse(result.plaintext);
    return result.plaintext;
}

For PDFs, we use Blob + URL.createObjectURL to create a temporary, in-memory URL. The decrypted data never leaves the browser's memory, maintaining end-to-end security.
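
One detail worth handling: a Blob URL keeps its underlying data alive until it is revoked or the document is unloaded. The helper below is a minimal sketch of pairing creation with cleanup; createPdfUrl and its release() method are hypothetical names introduced for illustration and are not part of the blog's code.

```javascript
// Hypothetical helper: wraps decrypted PDF bytes in a Blob URL and returns
// a release() function that frees the Blob once the viewer is no longer needed.
function createPdfUrl(rawBuffer) {
    const blob = new Blob([rawBuffer], { type: 'application/pdf' });
    const url = URL.createObjectURL(blob);
    let released = false;
    return {
        url,
        release() {
            // Revoking twice is harmless, but the flag keeps the intent explicit.
            if (!released) {
                URL.revokeObjectURL(url);
                released = true;
            }
        },
    };
}
```

In the browser, the returned url would go into the iframe's src, and release() could be called when the article is re-locked or replaced, for example from a pagehide listener.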

Session Cache Support for PDF Binaries

sessionStorage can only store strings, not an ArrayBuffer. For PDFs, we first encode the binary data as Base64 before storing it:

if (type === 'pdf') {
    const bytes = new Uint8Array(result.rawBuffer);
    let binary = '';
    for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
    try {
        sessionStorage.setItem(key, JSON.stringify({
            content: btoa(binary), contentType: 'pdf', ts: Date.now()
        }));
    } catch (err) {
        // A large PDF can exceed the storage quota; skip caching instead of failing the render
    }
}

And decode it when reading from the cache:

if (cached.contentType === 'pdf') {
    const binary = atob(cached.content);
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
    cachedResult.rawBuffer = bytes.buffer;
}

This way, users don't have to re-enter their password if they refresh the page within the same session.
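
Base64 has a size cost: every 3 bytes of input become 4 characters of output, so a 3 MiB PDF turns into 4 MiB of cached text. Browsers commonly cap sessionStorage at around 5 MB per origin, though the exact limit varies. The sketch below shows the arithmetic; fitsInSessionCache and its 4 MiB default budget are assumptions made for illustration, not part of the blog's code.

```javascript
// Base64 encodes every 3 input bytes as 4 output characters (padding included).
function base64Length(byteLength) {
    return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical guard: decide up front whether an encoded PDF is worth caching.
// The 4 MiB default budget is an assumption chosen to stay below a ~5 MB quota.
function fitsInSessionCache(byteLength, budgetChars = 4 * 1024 * 1024) {
    return base64Length(byteLength) <= budgetChars;
}
```

Checking fitsInSessionCache(result.rawBuffer.byteLength) before encoding would avoid building a large Base64 string that the browser then refuses to store.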

Build-Time Results

$ node _tools/build-encrypted-posts.js
Encrypting: posts/gol/2026-02-25-cross-experiment-en (.md)
  -> encrypted image: gol/.../cross_competition_dynamics.png.enc
  -> blog/encrypted/gol/2026-02-25-cross-experiment-en.json [markdown] (1 key slots)
Encrypting: posts/2026/2026-02-26-encryption-test-latex (.tex)
  -> blog/encrypted/2026/2026-02-26-encryption-test-latex.json [pdf] (1 key slots)
Encrypting: posts/gol/alife-paper-draft (.pdf)
  -> blog/encrypted/gol/alife-paper-draft.json [pdf] (1 key slots)

Encrypted 16 post(s).

The LaTeX file is compiled to a PDF by pdflatex at build time, encrypted, and then rendered in an iframe on the client side, so formulas and layout appear exactly as pdflatex typeset them.

Summary

This update extends the encryption system from "Markdown-only" to "encrypt almost anything." The design principles are preprocess-dispatch, unify-encrypt, and render-dispatch:

  1. At build time, apply different preprocessing based on file type (leave text as-is, convert notebooks to HTML, convert others to PDF).
  2. The encryption core remains completely unchanged, with all types sharing the PBKDF2 + AES-256-GCM two-layer key system.
  3. On the client side, select a rendering method based on the type field (marked.js / direct injection / iframe).

The cryptographic operations themselves were not modified (only the function signature was generalized to accept a buffer and a content type), so the security model is the same as before. The new build dependencies (pdflatex, LibreOffice) are only required if you use the corresponding file types; if you stick to Markdown and PDFs, you don't need to install anything new.