New Release

Launching Line 1.0

A lightweight, state-of-the-art model for creating doodle videos from images.

Line 1.0 Model

How Line 1.0 Works

This page gives a high-level technical explanation of Line 1.0. It is detailed enough for product and engineering readers, while keeping proprietary tuning and optimization internals private.

1. Model Overview

Line 1.0 is a deterministic drawing-order model used to convert static instructional imagery into animated, hand-drawn explainers. The model is intentionally designed as a hybrid system: classical geometric vision for interpretability, structured sequencing for reading order consistency, and time-aware rendering for controllable motion. This choice gives us predictable outputs across education, product walkthroughs, and technical diagrams where repeatability matters more than one-shot visual randomness.

In practical terms, the model does three things: it identifies drawable structures, it ranks those structures into a human-readable sequence, and it reveals them with temporally stable stroke growth. The output is therefore not just an animated mask; it is a sequence that approximates how a person would progressively draw while explaining. This is especially useful for multi-part scenes where narrative flow and motion timing must stay aligned with accompanying voiceover and beat-level scripting.

We deliberately expose the conceptual model and mathematical framing while withholding private constants, fallback scoring strategies, and robustness heuristics that materially affect production reliability. This keeps the explanation transparent without disclosing sensitive internals.

2. Algorithm Overview

The algorithm starts with spatial normalization. Every source image is mapped into a fixed render canvas, preserving aspect ratio and centering the content. This avoids distortion while ensuring consistent downstream timing and stroke geometry. After normalization, a style-dependent binarization stage produces an ink map. Solid-style rendering prefers globally stable thresholding, while sketch and pencil styles use adaptive local thresholding to preserve subtle line structure under varying local contrast.

Once an ink map is available, connected components are extracted and each component is converted into one or more contour primitives. These primitives are treated as atomic draw candidates. The next stage applies geometric grouping: candidates are clustered into visual lines using center-distance and height-aware rules, then sorted left-to-right within each line. A controlled merge stage combines neighboring candidates into larger semantic units when spacing and alignment imply they should be drawn as one continuous token.
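The component-extraction step can be illustrated with a minimal sketch. It assumes the ink map is a 2D grid of truthy/falsy values and uses 8-connectivity; both are assumptions for illustration, since this page does not state which connectivity or data layout Line 1.0 uses. The function name component_boxes is hypothetical.

```python
from collections import deque


def component_boxes(ink):
    """Return bounding boxes (x, y, w, h) of 8-connected ink regions.

    `ink` is a 2D list (rows of truthy/falsy values). Boxes are returned in
    scan order (top-to-bottom, left-to-right by first pixel encountered).
    """
    rows, cols = len(ink), len(ink[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y0 in range(rows):
        for x0 in range(cols):
            if not ink[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < cols and 0 <= ny < rows
                                and ink[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes
```

Each returned box stands in for one draw candidate. A production pipeline would more likely trace contours with an optimized vision library than flood-fill in pure Python.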

Temporal rendering is then applied per scene segment. Completed units are fully revealed, while the active unit is partially revealed according to local progress and contour-length budget. This produces the marker progression effect associated with whiteboard drawing, while preserving strict ordering constraints needed for legibility.

3. Mathematical Formulation (High-Level)

The resize and normalization stage uses a standard aspect-ratio-preserving transform. Let the input size be (w, h) and the target canvas be (W, H). The scale factor and the scaled size are:

s = min(W / w, H / h)

w' = s * w, h' = s * h
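As a small worked sketch of the transform above: the function name fit_to_canvas, the rounding to whole pixels, and the integer centering offsets are illustrative assumptions, not details published for Line 1.0.

```python
def fit_to_canvas(w, h, W, H):
    """Aspect-preserving fit of a (w, h) image into a (W, H) canvas.

    Returns (new_w, new_h, offset_x, offset_y), where the offsets center
    the scaled image on the canvas.
    """
    s = min(W / w, H / h)          # largest scale that fits both dimensions
    new_w = round(s * w)
    new_h = round(s * h)
    offset_x = (W - new_w) // 2    # horizontal centering
    offset_y = (H - new_h) // 2    # vertical centering
    return new_w, new_h, offset_x, offset_y
```

For an 800 x 600 input on a 1920 x 1080 canvas, s = min(2.4, 1.8) = 1.8, so the image becomes 1440 x 1080 and is padded by 240 pixels on the left and right.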

Background polarity is estimated from mean grayscale intensity. If I(x,y) is grayscale intensity over N pixels, we compute:

μ = (1 / N) Σ I(x,y)

This statistic controls whether the binary conversion is inverted for dark versus light backgrounds. For solid mode, threshold selection follows a variance-maximization principle (Otsu family objective); for local sketch-like modes, thresholding uses Gaussian-weighted neighborhood statistics.
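The solid-mode path can be sketched as follows. This is a textbook Otsu threshold over a flat list of 8-bit values, not the production implementation; the polarity cutoff of 128 and the helper names are assumptions made for this example.

```python
def mean_intensity(pixels):
    """Mean grayscale value (the statistic mu) over a flat list of 0..255 ints."""
    return sum(pixels) / len(pixels)


def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count in class 0
    sum0 = 0  # intensity sum in class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels, light_cutoff=128):
    """Return a list of booleans marking ink pixels.

    light_cutoff is an assumed value: a mean at or above it is treated as a
    light background (ink is dark); otherwise the polarity is inverted.
    """
    t = otsu_threshold(pixels)
    light_background = mean_intensity(pixels) >= light_cutoff
    return [(v <= t) if light_background else (v > t) for v in pixels]
```

The sketch and pencil modes would replace the single global threshold with a per-pixel threshold computed from a local neighborhood, which is not shown here.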

Line grouping and merge decisions use geometry constraints. A candidate glyph g with center cy_g and height h_g joins a line l with average center cy_l and height h_l when the vertical deviation remains bounded:

|cy_g - cy_l| < 0.5 * max(h_g, h_l)
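A minimal sketch of this grouping rule, under stated assumptions: h_l is taken to be the running mean height of the line's members, candidates are visited in order of vertical center, and each candidate joins the first line that satisfies the bound. The page does not specify these choices, and group_into_lines is a hypothetical name.

```python
def group_into_lines(boxes):
    """Group (x, y, w, h) boxes into visual lines, each sorted left-to-right."""
    lines = []  # each entry tracks running mean center, mean height, members
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        x, y, w, h = box
        cy = y + h / 2
        for line in lines:
            # Join when |cy_g - cy_l| < 0.5 * max(h_g, h_l).
            if abs(cy - line["cy"]) < 0.5 * max(h, line["h"]):
                line["items"].append(box)
                n = len(line["items"])
                line["cy"] += (cy - line["cy"]) / n  # update running mean center
                line["h"] += (h - line["h"]) / n     # update running mean height
                break
        else:
            lines.append({"cy": cy, "h": h, "items": [box]})
    return [sorted(line["items"], key=lambda b: b[0]) for line in lines]
```

With three 10 x 20 boxes whose centers sit at y = 20, 22, and 70, the first two share a line (deviation 2 is below the bound of 10) and the third starts a new one.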

Horizontal merge uses gap testing between adjacent candidates:

gap = x_next - (x_curr + w_curr)

merge if gap < 1.2 * h_avg and gap > -0.5 * h_avg
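The gap test can be sketched for one left-to-right line of boxes. Two points are assumptions for illustration: h_avg is computed as the mean height of the two boxes being compared, and a merge is represented by the union bounding box. The name merge_adjacent is hypothetical.

```python
def merge_adjacent(line):
    """Merge neighboring (x, y, w, h) boxes in a left-to-right sorted line."""
    if not line:
        return []
    merged = [line[0]]
    for nxt in line[1:]:
        cx, cy, cw, ch = merged[-1]
        nx, ny, nw, nh = nxt
        gap = nx - (cx + cw)       # negative when the boxes overlap
        h_avg = (ch + nh) / 2
        if -0.5 * h_avg < gap < 1.2 * h_avg:
            # Replace the current unit with the union of both boxes.
            left, top = min(cx, nx), min(cy, ny)
            right = max(cx + cw, nx + nw)
            bottom = max(cy + ch, ny + nh)
            merged[-1] = (left, top, right - left, bottom - top)
        else:
            merged.append(nxt)
    return merged
```

The lower bound on the gap keeps heavily overlapping candidates separate, while the upper bound stops units that are far apart from being fused.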

Segment assignment for multi-part beats uses center-position partitioning:

section_idx = floor((x + w / 2) / section_width)
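As a short sketch of the partitioning formula: the clamp to the valid index range and the num_sections parameter are assumptions added so that a unit whose center falls on or past the canvas edge still maps to a real section.

```python
import math


def section_index(x, w, section_width, num_sections):
    """Assign a unit to a section by the horizontal position of its center."""
    idx = math.floor((x + w / 2) / section_width)
    # Assumed safeguard: keep the index within 0 .. num_sections - 1.
    return max(0, min(idx, num_sections - 1))
```

On a 1920-pixel-wide canvas split into three 640-pixel sections, a unit at x = 600 with w = 100 has its center at 650 and lands in section 1.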

Rendering progress is computed over segment time with normalized interpolation:

p = (t - t_start) / (t_end - t_start)

i = floor(p * G), r = p * G - i

L_target = r * L_total

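Here, G is the number of drawable units in the segment, i is the index of the active unit (units before it are fully revealed), r is the fractional progress within that unit, and L_total is the total contour length of the active unit. At p = 1 the index i equals G, meaning every unit is complete. This formulation ensures smooth intra-unit growth while preserving exact inter-unit ordering.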
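The progress formulas can be sketched directly. Clamping p to [0, 1] and the return convention (G, 0.0) once the segment is finished are assumptions for this example; render_state is a hypothetical name.

```python
import math


def render_state(t, t_start, t_end, unit_lengths):
    """Return (i, L_target) for time t within a segment.

    Units with index below i are fully revealed; unit i is revealed up to
    contour length L_target. unit_lengths[k] is the total contour length
    of unit k.
    """
    G = len(unit_lengths)
    p = (t - t_start) / (t_end - t_start)
    p = min(max(p, 0.0), 1.0)      # assumed clamp for times outside the segment
    i = math.floor(p * G)
    if i >= G:
        return G, 0.0              # segment finished: no active unit remains
    r = p * G - i                  # fractional progress inside unit i
    return i, r * unit_lengths[i]
```

For a 10-second segment with four units, t = 3.75 gives p = 0.375 and p * G = 1.5, so unit 0 is complete and unit 1 is drawn to half of its contour length.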

We intentionally omit private calibration constants, resilience heuristics, and ranking tie-break logic. Those details are part of model hardening and are not published on this page.

Domain Coverage

Line 1.0 is designed to be domain-flexible rather than domain-locked. It works best on structured visual content where information is conveyed through shape hierarchy, labels, arrows, and iconography. In education, this includes science diagrams, mathematical concept sketches, and process explanations. In enterprise settings, common inputs include SOP visuals, safety instructions, compliance walkthroughs, and onboarding diagrams. For marketing and product communication, the model supports storyboard-like assets, feature breakdowns, and concept illustrations where controlled draw order improves audience retention.

The underlying sequencing logic is intentionally content-agnostic: as long as the image contains coherent drawable structures, the model can produce a stable animation path. This means teams can standardize one animation pipeline across multiple departments instead of maintaining separate tooling for technical, educational, and promotional content workflows.

Output Style and Format

The output aesthetic is deliberately aligned with whiteboard and doodle communication styles: progressive stroke reveal, high readability, and clean visual pacing. Rather than photorealistic interpolation, Line 1.0 focuses on narrative legibility. This makes outputs easier to follow in presentations, mobile feeds, and narrated explainers where viewers need to understand sequence, not just final composition.

In production, outputs are generated as standard video assets suitable for web and social channels, while preserving deterministic behavior for repeated runs on the same input profile. The style modes (solid, normal, pencil) allow teams to choose between cleaner high-contrast rendering and more sketch-like expressive rendering depending on brand tone and instructional context.

Representative Use Cases

Education and training: Convert static lesson graphics into time-synchronized explainers so learners see concept construction step by step.

Product and engineering demos: Animate architecture diagrams and feature maps to clarify component relationships without requiring live whiteboarding.

Corporate communication: Turn policy and process visuals into short walkthrough videos for onboarding, internal enablement, and operations updates.

Marketing storytelling: Use draw-order animation to reveal value propositions progressively, improving clarity in short-form campaigns.

For security and quality reasons, we describe use-case patterns and system behavior at a product level here, while withholding private optimization details and internal failure-handling strategies.