New media processing pipeline #680

Merged
merged 59 commits into from
Jan 25, 2025
Commits
bfb875b
Detect media files info using ImageMagick and Ffmpeg
davidmz Dec 17, 2024
ab673f9
Install ImageMagick and Ffmpeg in "checks" workflow machine
davidmz Dec 17, 2024
df2d082
Made filehandle.read call compatible with Node18
davidmz Dec 17, 2024
a0adead
Add files of type "general"
davidmz Dec 17, 2024
95304f2
Use ImageMagick instead of GraphicsMagick
davidmz Dec 23, 2024
d00a7d3
Detect/suggest the original file extension
davidmz Dec 24, 2024
3369e00
Detect rotation for images and video
davidmz Dec 25, 2024
0003efa
Add an algorithm to calculate the size of the preview images
davidmz Dec 26, 2024
05e1c6f
Fix rotated fixture file
davidmz Dec 29, 2024
0b11578
Add types for 'mime-types' package
davidmz Dec 29, 2024
2d19cf8
Use new Attachment fabric, update some tests
davidmz Dec 30, 2024
5723aea
Check the original JPEG for use as a largest preview
davidmz Dec 31, 2024
9bcc83a
Present old 'image_sizes' data in a new format
davidmz Dec 31, 2024
14fa3e4
Update the orientation tests
davidmz Jan 1, 2025
effcafa
Use Attachment.create in user deletion test
davidmz Jan 1, 2025
c0f6924
Refactor the S3 emulation in tests
davidmz Jan 2, 2025
75b92b4
Use Attachment.create everywhere
davidmz Jan 2, 2025
c78578b
Update geometry tests
davidmz Jan 2, 2025
ece30c4
Refactor some path-related work
davidmz Jan 2, 2025
641b0fc
Emit the files list for 'general' file type
davidmz Jan 2, 2025
a914e17
Fix Blob construction
davidmz Jan 2, 2025
ce19809
Remove obsolete test of WebP attachment
davidmz Jan 2, 2025
4bd2ae4
Rewrite the legacy attachment serializer
davidmz Jan 3, 2025
9b65853
Update the createMockAttachmentAsync method
davidmz Jan 3, 2025
0da0f8b
Update OpenGraph image handling
davidmz Jan 3, 2025
858f971
Use the maxSizedVariant helper
davidmz Jan 3, 2025
8be54ab
Check the required attachment subdirectories on server start
davidmz Jan 3, 2025
04a3f87
Remove unused methods of Attachment
davidmz Jan 3, 2025
185353b
Use ImageMagick CLI instead of 'gm' package
davidmz Jan 3, 2025
a52326b
Remove graphicsmagick from the "checks" image
davidmz Jan 3, 2025
f3813e9
Add 'meta' column to the attachments table
davidmz Jan 3, 2025
6fa36ba
Handle multi-frame images properly
davidmz Jan 4, 2025
9041ca0
Extract additional info from AVC files
davidmz Jan 7, 2025
07ebb2b
Add geometry calculations for video
davidmz Jan 8, 2025
480427f
Create previews subdirectories in runtime
davidmz Jan 8, 2025
88d768b
Change the spawnAsync args type, allow to use array of arrays
davidmz Jan 9, 2025
5d6975c
Use smaller video fixtures
davidmz Jan 10, 2025
6427db1
Add video processing (synchronous, for now)
davidmz Jan 10, 2025
04e9b92
Update dockerfile instructions
davidmz Jan 10, 2025
103c9af
Add "width", "height" and "duration" fields to the attachments table
davidmz Jan 10, 2025
c89b41d
Serialize animated images
davidmz Jan 10, 2025
ad99156
Better detect ffprobe errors
davidmz Jan 11, 2025
0283e92
Refactor some types
davidmz Jan 11, 2025
10a5fc5
Early exit on error in ffmpeg
davidmz Jan 11, 2025
cd756fa
Add 'silent' field to attachment metadata
davidmz Jan 11, 2025
0cff487
Allow to limit the number of simultaneous executions for some job types
davidmz Nov 28, 2024
69e40dd
Process video files using deferred job
davidmz Jan 12, 2025
b87ee73
Add image/avif to the inline mime types
davidmz Jan 13, 2025
f17eb2d
Add tests for the file extensions and Content-Disposition's
davidmz Jan 13, 2025
a83133a
Add realtime notification on attachment update
davidmz Jan 14, 2025
41afd8d
Add new (v4) attachment serializer and 'GET /attachments/:attId' method
davidmz Jan 17, 2025
dccca74
Don't check every attachments subdirs on start
davidmz Jan 17, 2025
c0e18bb
Update changelog and api_versions file
davidmz Jan 17, 2025
65517a4
Add a new `GET /vN/attachments/:attId/:type` API endpoint
davidmz Jan 22, 2025
9ec8813
Add a fileSizeLimitByType config option
davidmz Jan 24, 2025
bfd76e1
Limit the user's media in the processing queue
davidmz Jan 24, 2025
686576e
Use 'image' size limit in bookmarklet controller
davidmz Jan 24, 2025
4cc79ed
Use .tmp file for files in progress, add 'inProgress' flag to the v2 …
davidmz Jan 25, 2025
005ca74
Remove packages that are not in use anymore
davidmz Jan 25, 2025
Detect media files info using ImageMagick and Ffmpeg
davidmz committed Jan 7, 2025
commit bfb875bad7e6cf6f14916700ac8dd0ab79a20b97
177 changes: 177 additions & 0 deletions app/support/media-files/detect.ts
@@ -0,0 +1,177 @@
import util from 'util';
import { open } from 'fs/promises';

import gm from 'gm';

import { spawnAsync } from '../spawn-async';

import { FfprobeResult, MediaInfo, MediaInfoVideo } from './types';

const im = gm.subClass({ imageMagick: true });

export async function detectMediaType(file: string): Promise<MediaInfo> {
  // Check by file signature
  const probablyImage = await hasImageSignature(file);

  if (probablyImage) {
    // Identify using ImageMagick
    const image = im(file);
    const identifyAsync = util.promisify<string, string>(image.identify);

    try {
      const info = await identifyAsync.call(image, '%m %w %h');
      const parts = info.split(' ');
      const fmt = parts[0].toLowerCase();

      // Animated images? Only GIF is supported for now
      if (fmt === 'gif') {
        const data = await detectAnimatedImage(file);

        if (data) {
          return { ...data, isAnimatedImage: true };
        }
      }

      return {
        type: 'image',
        format: fmt,
        width: parseInt(parts[1], 10),
        height: parseInt(parts[2], 10),
      };
    } catch {
      return { type: 'general' };
    }
  }

  // Identify other types using ffprobe
  try {
    const { format, streams } = await runFfprobe(file);
    const fmt = format.format_name.split(',')[0].toLowerCase();

    const videoStream = streams.find((s) => s.codec_type === 'video');
    const audioStream = streams.find((s) => s.codec_type === 'audio');

    if (videoStream && format.duration) {
      return {
        type: 'video',
        format: fmt,
        vCodec: videoStream.codec_name,
        aCodec: audioStream?.codec_name,
        duration: parseFloat(format.duration),
        width: videoStream.width!,
        height: videoStream.height!,
        tags: format.tags,
      };
    } else if (audioStream && format.duration) {
      return {
        type: 'audio',
        format: fmt,
        aCodec: audioStream.codec_name,
        duration: parseFloat(format.duration),
        tags: format.tags,
      };
    }

    return { type: 'general' };
  } catch {
    return { type: 'general' };
  }
}

async function detectAnimatedImage(file: string): Promise<MediaInfoVideo | null> {
  const { format, streams } = await runFfprobe(file);
  const fmt = format.format_name.split(',')[0].toLowerCase();

  const videoStream = streams.find((s) => s.codec_type === 'video');

  if (
    videoStream &&
    format.duration &&
    videoStream.nb_frames &&
    parseInt(videoStream.nb_frames, 10) > 1
  ) {
    return {
      type: 'video',
      format: fmt,
      vCodec: videoStream.codec_name,
      width: videoStream.width!,
      height: videoStream.height!,
      duration: parseFloat(format.duration),
    };
  }

  return null;
}

async function runFfprobe(file: string): Promise<FfprobeResult> {
  const out = await spawnAsync('ffprobe', [
    '-hide_banner',
    '-loglevel',
    'warning',
    '-show_format',
    '-show_streams',
    '-print_format',
    'json',
    '-i',
    file,
  ]);
  return JSON.parse(out.stdout) as FfprobeResult;
}

async function hasImageSignature(file: string): Promise<boolean> {
  const fh = await open(file, 'r');

  try {
    const fileHead = new Uint8Array(16);
    await fh.read(fileHead);
    return checkImageSignature(fileHead);
  } finally {
    await fh.close();
  }
}

function isStartsWith(buf: Uint8Array, codes: number[] | string, offset = 0): boolean {
  let prefix: Uint8Array;

  if (typeof codes === 'string') {
    prefix = new TextEncoder().encode(codes);
  } else {
    prefix = new Uint8Array(codes);
  }

  for (let i = 0; i < prefix.length; i++) {
    if (buf[i + offset] !== prefix[i]) {
      return false;
    }
  }

  return true;
}

/**
 * Only these image types are supported: JPEG/JFIF, PNG, WEBP, AVIF, GIF, HEIC and HEIF
 *
 * @see https://en.wikipedia.org/wiki/List_of_file_signatures
 * @see https://legacy.imagemagick.org/api/MagickCore/magic_8c_source.html
 */
function checkImageSignature(x: Uint8Array): boolean {
  return (
    // JPEG variants
    isStartsWith(x, [0xff, 0xd8, 0xff]) ||
    // PNG
    isStartsWith(x, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]) ||
    // WEBP
    (isStartsWith(x, 'RIFF') && isStartsWith(x, 'WEBP', 8)) ||
    // GIF
    isStartsWith(x, 'GIF87a') ||
    isStartsWith(x, 'GIF89a') ||
    // HEIC/HEIF
    isStartsWith(x, 'ftypheic', 4) ||
    isStartsWith(x, 'ftypheix', 4) ||
    isStartsWith(x, 'ftypmif1', 4) ||
    // AVIF
    isStartsWith(x, 'ftypavif', 4) ||
    // The end
    false
  );
}
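The offsets in `checkImageSignature` matter: `RIFF` is a container shared with other formats (WAV, for example), so WEBP needs a second check at offset 8, and the `ftyp…` brands of HEIC/AVIF start at offset 4, after a 4-byte box size. Below is a minimal standalone sketch of that idea. The names `ascii`, `hasPrefixAt`, `makeHead`, `isWebp` and `isAvif` are hypothetical and not part of this commit, and the sample headers are hand-built rather than read from real files.

```typescript
// Hypothetical helper: converts an ASCII string to its byte codes.
function ascii(s: string): number[] {
  return Array.from(s, (ch) => ch.charCodeAt(0));
}

// Hypothetical helper: true if `prefix` occurs in `buf` starting at `offset`.
function hasPrefixAt(buf: Uint8Array, prefix: number[], offset = 0): boolean {
  if (offset + prefix.length > buf.length) {
    return false;
  }

  return prefix.every((code, i) => buf[offset + i] === code);
}

// Builds a zero-filled 16-byte header with the given chunks written at their offsets.
function makeHead(chunks: Array<[number, number[]]>): Uint8Array {
  const head = new Uint8Array(16);

  for (const [offset, bytes] of chunks) {
    head.set(bytes, offset);
  }

  return head;
}

const isWebp = (h: Uint8Array): boolean =>
  hasPrefixAt(h, ascii('RIFF')) && hasPrefixAt(h, ascii('WEBP'), 8);
const isAvif = (h: Uint8Array): boolean => hasPrefixAt(h, ascii('ftypavif'), 4);

// Hand-built sample headers (assumptions for illustration only)
const webpHead = makeHead([
  [0, ascii('RIFF')],
  [8, ascii('WEBP')],
]);
const wavHead = makeHead([
  [0, ascii('RIFF')],
  [8, ascii('WAVE')],
]);
const avifHead = makeHead([[4, ascii('ftypavif')]]);

console.log(isWebp(webpHead), isWebp(wavHead), isAvif(avifHead));
```

With these inputs, the WAV-like header passes the `RIFF` check but fails the check at offset 8, which is why the WEBP rule combines both conditions.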
58 changes: 58 additions & 0 deletions app/support/media-files/types.ts
@@ -0,0 +1,58 @@
export type MediaInfoVisual = {
  width: number;
  height: number;
};

export type MediaInfoPlayable = {
  duration: number;
};

export type MediaInfoCommon = {
  tags?: Record<string, string>;
};

export type MediaInfoImage = {
  type: 'image';
  format: string;
} & MediaInfoVisual &
  MediaInfoCommon;

export type MediaInfoVideo = {
  type: 'video';
  format: string;
  vCodec: string;
  aCodec?: string;
  isAnimatedImage?: true;
} & MediaInfoVisual &
  MediaInfoPlayable &
  MediaInfoCommon;

export type MediaInfoAudio = {
  type: 'audio';
  format: string;
  aCodec: string;
} & MediaInfoPlayable &
  MediaInfoCommon;

export type MediaInfoGeneral = {
  type: 'general';
} & MediaInfoCommon;

export type MediaInfo = MediaInfoImage | MediaInfoVideo | MediaInfoAudio | MediaInfoGeneral;

export type Stream = { codec_name: string } & (
  | {
      codec_type: 'video';
      width: number;
      height: number;
      nb_frames: string;
    }
  | {
      codec_type: 'audio';
    }
);

export type FfprobeResult = {
  format: { format_name: string; duration: string; tags?: Record<string, string> };
  streams: Stream[];
};
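Because `Stream` is a union discriminated by `codec_type`, a type-guard predicate passed to `find` can narrow the result to the video variant, which would make the `!` assertions on `width` and `height` unnecessary. The sketch below shows this as one possible alternative, not as what the commit does. `VideoStream`, `AudioStream` and `summarize` are hypothetical names, the types are a trimmed local copy, and `sampleJson` is a hand-written stand-in for ffprobe output, not captured from a real run.

```typescript
// Trimmed local copies of the types, split into named variants (hypothetical names).
type VideoStream = {
  codec_name: string;
  codec_type: 'video';
  width: number;
  height: number;
  nb_frames: string;
};
type AudioStream = { codec_name: string; codec_type: 'audio' };
type Stream = VideoStream | AudioStream;

type FfprobeResult = {
  format: { format_name: string; duration: string; tags?: Record<string, string> };
  streams: Stream[];
};

// Hand-written sample in the shape of `ffprobe -print_format json` output (an assumption).
const sampleJson = JSON.stringify({
  format: { format_name: 'mov,mp4,m4a,3gp,3g2,mj2', duration: '12.5' },
  streams: [
    { codec_name: 'h264', codec_type: 'video', width: 1280, height: 720, nb_frames: '300' },
    { codec_name: 'aac', codec_type: 'audio' },
  ],
});

function summarize(result: FfprobeResult): string {
  // ffprobe may report several comma-separated format names; take the first one
  const fmt = result.format.format_name.split(',')[0].toLowerCase();

  // The type guard narrows the found element to VideoStream
  const video = result.streams.find((s): s is VideoStream => s.codec_type === 'video');

  if (video) {
    return `video ${fmt} ${video.codec_name} ${video.width}x${video.height}`;
  }

  const audio = result.streams.find((s) => s.codec_type === 'audio');
  return audio ? `audio ${fmt} ${audio.codec_name}` : 'general';
}

const summary = summarize(JSON.parse(sampleJson) as FfprobeResult);
console.log(summary);
```

Note that `JSON.parse(...) as FfprobeResult` is still an unchecked cast, as in the commit: the narrowing helps only inside code that already trusts the parsed shape.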
34 changes: 34 additions & 0 deletions app/support/spawn-async.ts
@@ -0,0 +1,34 @@
import { spawn, type SpawnOptionsWithoutStdio } from 'child_process';

/**
 * Spawns a child process and returns a promise resolving with its output
 *
 * @param {string} command - The command to run.
 * @param {Array<string>} args - List of string arguments.
 * @param {SpawnOptionsWithoutStdio} options - Options to pass to the spawn function.
 * @returns {Promise<{stdout: string, stderr: string}>} - Promise that resolves with the output.
 */
export function spawnAsync(
  command: string,
  args: readonly string[] = [],
  options: SpawnOptionsWithoutStdio = {},
): Promise<{ stdout: string; stderr: string }> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, options);

    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (data) => (stdout += data.toString()));
    child.stderr.on('data', (data) => (stderr += data.toString()));

    child.on('close', (code) => {
      if (code === 0) {
        resolve({ stdout, stderr });
      } else {
        reject(new Error(`Process exited with code ${code}\n${stderr}`));
      }
    });

    child.on('error', (err) => reject(err));
  });
}
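`spawnAsync` resolves only when the process exits with code 0. Any other exit code rejects with an error whose first line carries the code and whose remainder is the collected stderr. The sketch below exercises both paths with a compact local copy of the helper (the `options` parameter is omitted for brevity). It runs the current Node binary via `process.execPath` instead of `ffprobe` so that it has no external dependencies. `demo` is a hypothetical name used only for this illustration.

```typescript
import { spawn } from 'child_process';

// Compact local copy of the helper from this commit, without the `options` parameter.
function spawnAsync(
  command: string,
  args: readonly string[] = [],
): Promise<{ stdout: string; stderr: string }> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);

    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (data) => (stdout += data.toString()));
    child.stderr.on('data', (data) => (stderr += data.toString()));

    child.on('close', (code) => {
      if (code === 0) {
        resolve({ stdout, stderr });
      } else {
        reject(new Error(`Process exited with code ${code}\n${stderr}`));
      }
    });

    child.on('error', (err) => reject(err));
  });
}

// Hypothetical demo: one successful run and one run that exits with code 3.
async function demo(): Promise<{ ok: string; failure: string }> {
  const node = process.execPath;

  const { stdout } = await spawnAsync(node, ['-e', 'console.log("ok")']);

  let failure = '';

  try {
    await spawnAsync(node, ['-e', 'process.exit(3)']);
  } catch (err) {
    // Keep only the first line; the rest is the child's stderr
    failure = (err as Error).message.split('\n')[0];
  }

  return { ok: stdout.trim(), failure };
}
```

Callers such as `detectMediaType` wrap the call in `try`/`catch` and fall back to `{ type: 'general' }`, so a non-zero exit from `ffprobe` is treated as "not a recognized media file" rather than as a fatal error.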
1 change: 1 addition & 0 deletions package.json
@@ -121,6 +121,7 @@
    "@types/bytes": "~3.1.4",
    "@types/cache-manager": "~4.0.6",
    "@types/debug": "~4.1.12",
    "@types/gm": "~1.25.4",
    "@types/humps": "~2.0.6",
    "@types/jsonwebtoken": "~9.0.6",
    "@types/koa": "~2.15.0",
11 changes: 11 additions & 0 deletions test/fixtures/media-files/README.txt
@@ -0,0 +1,11 @@
The "squirrel" image is licensed under CC0 and was taken from Pixabay:
https://pixabay.com/photos/squirrel-tail-bushy-tail-forest-316426/

The "dodecahedron" image is licensed under a CC license and was taken from Wikimedia Commons:
https://commons.wikimedia.org/wiki/File:256-XX-dodecahedron.gif

The "music" audio is licensed under CC0 and was taken from Wikimedia Commons:
https://commons.wikimedia.org/wiki/File:Piermic_-_Improvisation_with_Sopranino_Recorder_(2015).mp3

The "polyphon" video is licensed under CC0 and was taken from Wikimedia Commons:
https://commons.wikimedia.org/wiki/File:Polyphon-_Kreuz-Polka.ogv
Binary file added test/fixtures/media-files/dodecahedron.gif