Library usage

To use patreon-dl in your own project:

import PatreonDownloader from 'patreon-dl';

const url = '....';

const downloader = await PatreonDownloader.getInstance(url, [options]);

await downloader.start();

Here, we first obtain a downloader instance by calling PatreonDownloader.getInstance(), passing to it the URL we want to download from (one of the supported URL formats) and downloader options, if any.

Then, we call start() on the downloader instance to begin the download process. The start() method returns a Promise that resolves when the download process has ended.

Downloader options

An object with the following properties (all optional):

Option Description
cookie Cookie to include in requests; required for accessing patron-only content. See How to obtain Cookie.
useStatusCache Whether to use status cache to quickly determine whether a target that had been downloaded before has changed since the last download. Default: true
stopOn Sets the condition to stop the downloader. Values can be:
  • never: do not stop; run till the end.
  • previouslyDownloaded: stop on encountering a post or product that was previously downloaded. Requires enabling useStatusCache.
  • publishDateOutOfRange: stop on encountering a post or product published outside the date range set by include.postsPublished or include.productsPublished (as the case may be).
Deprecated values:
  • postPreviouslyDownloaded: replaced by previouslyDownloaded.
  • postPublishDateOutOfRange: replaced by publishDateOutOfRange.
Default: never
pathToFFmpeg Path to ffmpeg executable. If not specified, ffmpeg will be called directly when needed, so make sure it is in the PATH.
pathToDeno Path to deno executable. This is used by the built-in YouTube downloader to sandbox executed code at runtime. If not specified, deno will be called directly when needed. If deno is not found, the built-in downloader will evaluate code without sandboxing.
pathToYouTubeCredentials Path to file storing YouTube credentials for connecting to a YouTube account when downloading embedded YouTube videos. Its purpose is to allow YouTube Premium accounts to download videos at higher than normal qualities. For more information, see Configuring YouTube connection.
maxVideoResolution Maximum video resolution to download (height in pixels). Only applies to Patreon-hosted videos and embedded YouTube videos when using the built-in YouTube downloader. If not set, or value is 0, negative or null, the best available quality will be downloaded.
outDir Path to directory where content is saved. Default: current working directory
dirNameFormat How to name directories: (object)
filenameFormat Naming of files: (object)
include What to include in the download: (object)
  • lockedContent: whether to process locked content. Default: true
  • postsInTier: see Filtering posts by tier
  • postsWithMediaType: sets the media type criteria for downloading posts. Values can be:
    • any: download posts regardless of the type of media they contain. Also applies to posts that do not contain any media.
    • none: only download posts that do not contain media.
    • Array<image | video | audio | attachment | podcast>: only download posts that contain the specified media type(s).
    Default: any
  • postsPublished: sets the publish date range for posts. Its value is an object with two properties: after and before (both a DateTime object). You can specify one or both to indicate an open-ended or closed date range.
  • productsPublished: same as postsPublished, but applies to downloading products.
  • campaignInfo: whether to save campaign info. Default: true
  • contentInfo: whether to save content info. Default: true
  • contentMedia: the type of content media to download (images, videos, audio, attachments, excluding previews). Values can be:
    • true: download all content media.
    • false: do not download content media.
    • Array<image | video | audio | attachment | file>: only download the specified media type(s).
    Default: true
  • previewMedia: the type of preview media to download, if available. Values can be:
    • true: download all preview media.
    • false: do not download preview media.
    • Array<image | video | audio>: only download the specified media type(s).
    Default: true
  • protectedMedia: download videos even if they are protected by DRM. Keep in mind the downloaded videos will not play properly since there is no decryption key. Default: false
  • allMediaVariants: whether to download all media variants, if available. If false, only the best quality variant will be downloaded. Default: false
  • mediaThumbnails: whether to download thumbnails if available. Default: true
  • mediaByFilename: sets the filename pattern to match against media files. Those that do not match the provided pattern will not be downloaded. Its value is an object with three properties corresponding to the media types supported: images, audio and attachments. For example, to download only ZIP attachments, you would set the option like this:

    mediaByFilename: { attachments: '*.zip' }

    Internally, pattern matching is done by minimatch, which supports glob patterns. By default, patterns are case-sensitive. To ignore case, start the pattern with !.
  • comments: whether to fetch and save post comments.
request Network request options: (object)
  • userAgent: sets the User-Agent header to use when making requests. If not provided, a default User-Agent will be used.
  • maxRetries: maximum number of retries if a request or download fails. Default: 3
  • maxConcurrent: maximum number of concurrent downloads. Default: 10
  • minTime: minimum time to wait between starting requests or downloads (milliseconds). Default: 333
  • proxy: sets the proxy to use. Supports HTTP, HTTPS, SOCKS4 and SOCKS5 protocols. Value can be null (default - no proxy) or object with the following properties:
    • url: URL of proxy server, adhering to scheme: protocol://[username:[password]]@hostname:port
    • rejectUnauthorizedTLS: when connecting to a proxy server through SSL/TLS, this option indicates whether invalid certificates should be rejected. If your proxy server uses self-signed certs, you would want to set this to false. Default: true

    Note: ffmpeg, which is required to download videos in streaming format, supports HTTP proxy only.
fileExistsAction What to do when a target file already exists: (object)
  • info: in the context of saving info (such as campaign or post info), the action to take when a file belonging to the info already exists. Default: saveAsCopyIfNewer
  • infoAPI: API data is saved as part of info. Because it changes frequently and is usually used for debugging purposes only, you can set a different action to take when an API data file already exists. Default: overwrite
  • content: in the context of downloading content, the action to take when a file belonging to the content already exists. Default: skip

Supported actions:

  • overwrite: overwrite existing file.
  • skip: skip saving the file.
  • saveAsCopy: save the file under incremented filename (e.g. "abc.jpg" becomes "abc (1).jpg").
  • saveAsCopyIfNewer: like saveAsCopy, but only do so if the contents have actually changed.

embedDownloaders External downloader for embedded videos. See External downloaders.
logger See Logger
dryRun Run without writing files to disk (except logs, if any). Default: false
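
As a rough sketch of how the options above fit together, the following builds an options object for an incremental run that only keeps images and ZIP attachments. The helper name buildOptions and every value chosen here are illustrative assumptions; only the option names and permitted values are taken from the table above.

```javascript
// Hypothetical helper (not part of patreon-dl): assembles a downloader
// options object using option names documented in the table above.
function buildOptions(cookie, outDir) {
  return {
    cookie,                          // needed for patron-only content
    outDir,
    useStatusCache: true,            // must be enabled for the stopOn value below
    stopOn: 'previouslyDownloaded',
    include: {
      postsWithMediaType: ['image', 'attachment'],
      contentMedia: ['image', 'attachment'],
      previewMedia: false,
      mediaByFilename: { attachments: '*.zip' }
    },
    request: { maxRetries: 5, maxConcurrent: 4, minTime: 500 },
    fileExistsAction: {
      info: 'saveAsCopyIfNewer',
      infoAPI: 'overwrite',
      content: 'skip'
    },
    dryRun: true                     // switch to false to actually write files
  };
}
```

The returned object would be passed as the second argument to PatreonDownloader.getInstance(url, options).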

Campaign directory name format

Format to apply when naming campaign directories. A format is a string pattern consisting of fields enclosed in curly braces.

What is a campaign directory?

When you download content, a directory is created for the campaign that hosts the content. Content directories, which store the downloaded content, are then placed under the campaign directory. If campaign info could not be obtained from the content, the content directory is created directly under outDir.

A format must contain at least one of the following fields:

  • creator.vanity
  • creator.name
  • creator.id
  • campaign.name
  • campaign.id

Characters enclosed in square brackets followed by a question mark denote conditional separators. If the value of a field could not be obtained or is empty, the conditional separator immediately adjacent to it will be omitted from the name.

Default: '{creator.vanity}[ - ]?{campaign.name}'
Fallback: 'campaign-{campaign.id}'
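
To make the conditional separator rule concrete, here is a small illustrative resolver. It is not the library's implementation: renderFormat is a hypothetical function, and it assumes a separator is dropped whenever either field directly next to it is missing or empty.

```javascript
// Illustrative only: resolves a name format such as
// '{creator.vanity}[ - ]?{campaign.name}' against a map of field values.
function renderFormat(format, values) {
  // Tokens are {field}, [separator]?, runs of literal text, or a stray bracket.
  const tokens = format.match(/\{[^}]+\}|\[[^\]]*\]\?|[^{[]+|[{[]/g) ?? [];
  const parts = tokens.map((token) => {
    if (token.length > 2 && token.startsWith('{') && token.endsWith('}')) {
      const value = values[token.slice(1, -1)];
      return { kind: 'field', text: value == null ? '' : String(value) };
    }
    if (token.startsWith('[') && token.endsWith(']?')) {
      return { kind: 'sep', text: token.slice(1, -2) };
    }
    return { kind: 'literal', text: token };
  });
  return parts
    .filter((part, i) => {
      if (part.kind !== 'sep') return true;
      // Keep the separator only if no adjacent field is empty.
      return [parts[i - 1], parts[i + 1]].every(
        (n) => !n || n.kind !== 'field' || n.text !== ''
      );
    })
    .map((part) => part.text)
    .join('');
}
```

With both fields present, the default format would yield something like "johndoe - My Campaign"; with an empty campaign name, the separator is omitted and only "johndoe" remains.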

Content directory name format

Format to apply when naming content directories. A format is a string pattern consisting of fields enclosed in curly braces.

What is a content directory?

Content can be a post or product. A directory is created for each piece of content. Downloaded items for the content are placed under this directory.

A format must contain at least one of the following unique identifier fields:

  • content.id: ID of content
  • content.slug: last segment of the content URL

In addition, a format can contain the following fields:

  • content.name: post title or product name
  • content.type: type of content ('product' or 'post')
  • content.publishDate: publish date (ISO UTC format)

Characters enclosed in square brackets followed by a question mark denote conditional separators. If the value of a field could not be obtained or is empty, the conditional separator immediately adjacent to it will be omitted from the name.

Default: '{content.id}[ - ]?{content.name}'
Fallback: '{content.type}-{content.id}'

Media filename format

Filename format of a downloaded item. A format is a string pattern consisting of fields enclosed in curly braces.

A format must contain at least one of the following fields:

  • media.id: ID of the item downloaded (assigned by Patreon)
  • media.filename: can be one of the following, in order of availability:
    • original filename included in the item's API data; or
    • filename derived from the header of the response to the HTTP download request.

In addition, a format can contain the following fields:

  • media.type: type of item (e.g. 'image' or 'video')
  • media.variant: where applicable, the variant of the item (e.g. 'original', 'thumbnailSmall'...for images)
  • src.type: the type of the item's source: 'post', 'product', 'campaign' or 'collection'
  • src.id: the ID of the item's source
  • src.title: title of the item's source
  • src.date: the publish / creation date of the item's source

If allMediaVariants is true and media.variant is not included in the format, it will be appended to the format.

Sometimes media.filename cannot be obtained, in which case it will be replaced with media.id, unless media.id is already present in the format.

Characters enclosed in square brackets followed by a question mark denote conditional separators. If the value of a field could not be obtained or is empty, the conditional separator immediately adjacent to it will be omitted from the name.

Default: '{media.filename}'
Fallback: '{media.type}-{media.id}'

Filtering posts by tier

To download posts belonging to specific tier(s), set the include.postsInTier option. Values can be:

  • any: any tier (i.e. no filter)
  • Array of tier IDs (string[])

To obtain the IDs of tiers for a particular creator, first get the campaign through PatreonDownloader.getCampaign(), then inspect the rewards property:

const abortController = new AbortController();
const signal = abortController.signal; // optional
const logger = new MyLogger(); // optional - see Logger section
const campaign = await PatreonDownloader.getCampaign('johndoe', signal, logger);

// Sometimes a creator is identified only by user ID, in which case you would do this:
// const campaign = await PatreonDownloader.getCampaign({ userId: '80821958' }, signal, logger);

const tiers = campaign.rewards;
tiers.forEach((tier) => {
  console.log(`${tier.id} - ${tier.title}`);
});

See Campaign, Reward.
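
Once the rewards are available, the tier IDs can be fed back into the include.postsInTier option. The sketch below assumes each reward exposes id and title as in the snippet above; tierIdsByTitle is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: picks tier IDs by (case-insensitive) tier title.
function tierIdsByTitle(rewards, wantedTitles) {
  const wanted = new Set(wantedTitles.map((title) => title.toLowerCase()));
  return rewards
    .filter((tier) => wanted.has(String(tier.title ?? '').toLowerCase()))
    .map((tier) => tier.id);
}

// Example with made-up reward data standing in for campaign.rewards:
const sampleRewards = [
  { id: '111', title: 'Bronze' },
  { id: '222', title: 'Silver' },
  { id: '333', title: 'Gold' }
];

const tierOptions = {
  include: { postsInTier: tierIdsByTitle(sampleRewards, ['silver', 'gold']) }
};
```

tierOptions could then be merged into the options passed to PatreonDownloader.getInstance().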

External downloaders

You can specify external downloaders for embedded videos / links. Each entry in the embedDownloaders option is an object with the following properties:

Property Description
provider Name of the provider of embedded content. E.g. youtube, vimeo (case-insensitive)
exec The command to run to download the embedded content

exec can contain fields enclosed in curly braces. They will be replaced with actual values at runtime:

Field Description
post.id ID of the post containing the embedded video
embed.provider Name of the provider
embed.provider.url Link to the provider's site
embed.url Link to the video page supplied by the provider
embed.subject Subject of the video
embed.html The HTML code that embeds the video player on the Patreon page
dest.dir The directory where the video should be saved

For example usage of exec, see example-embed.conf.

External downloaders are not subject to request.maxRetries and fileExistsAction settings. This is because patreon-dl has no control over the downloading process nor knowledge about the outcome of it (including where and under what name the file was saved).
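
As a sketch, an embedDownloaders entry that hands YouTube embeds to yt-dlp might look like the following. The use of yt-dlp and its output template are assumptions for illustration. previewExec is a hypothetical helper that only mimics the field substitution so you can inspect the resulting command; it is not how patreon-dl performs the replacement internally.

```javascript
// Assumed example entry: delegates embedded YouTube videos to yt-dlp.
const embedDownloaders = [
  {
    provider: 'youtube',
    exec: 'yt-dlp -o "{dest.dir}/%(title)s.%(ext)s" "{embed.url}"'
  }
];

// Hypothetical preview helper: substitutes known {field} placeholders
// and leaves anything it does not recognize untouched.
function previewExec(exec, fields) {
  return exec.replace(/\{([a-z.]+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(fields, name) ? fields[name] : match
  );
}

const command = previewExec(embedDownloaders[0].exec, {
  'dest.dir': '/tmp/out',
  'embed.url': 'https://example.com/watch'
});
```

The embedDownloaders array is what would be passed in the downloader options; previewExec is only a debugging aid.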

Configuring YouTube connection

In its simplest form, the process of connecting patreon-dl to a YouTube account is as follows:

  1. Obtain credentials by having the user visit a Google page that links his or her account to a 'device' (which in this case is actually patreon-dl).
  2. Save the credentials, as a JSON string, to a file.
  3. Pass the path of the file to PatreonDownloader.getInstance()

To obtain credentials, you can use the YouTubeCredentialsCapturer class:

import fs from 'fs';
import { YouTubeCredentialsCapturer } from 'patreon-dl';

// Note: you should wrap the following logic inside an async
// process, and resolve when the credentials have been saved.

const capturer = new YouTubeCredentialsCapturer();

/**
 * 'pending' event emitted when verification data is ready and waiting
 * for user to carry out the verification process.
 */
capturer.on('pending', (data) => {
  // `data` is an object: { verificationURL: <string>, code: <string> }
  // Use `data` to provide instructions to the user:
  console.log(
    `In a browser, go to the following Verification URL and enter Code:

    - Verification URL: ${data.verificationURL}
    - Code: ${data.code}

    Then wait for this script to complete.`);
});

/**
 * 'capture' event emitted when the user has completed verification and the 
 * credentials have been relayed back to the capturer.
 */
capturer.on('capture', (credentials) => {
  // `credentials` is an object which you need to save to file as JSON string.
  fs.writeFileSync('/path/to/yt-credentials.json', JSON.stringify(credentials));
  console.log('Credentials saved!');
});

// When you have added the listeners, start the capture process.
capturer.begin();

Then, pass the path of the file to PatreonDownloader.getInstance():

const downloader = await PatreonDownloader.getInstance(url, {
  ...
  pathToYouTubeCredentials: '/path/to/yt-credentials.json'
});

You should ensure the credentials file is writable, as it needs to be updated with new credentials when the current ones expire. The process of renewing credentials is done automatically by the downloader.
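
The note about wrapping the capture logic in an async process could be realized along these lines. captureCredentials is a hypothetical wrapper; it relies only on the 'pending' and 'capture' events and the begin() method shown above, and takes the save and display steps as callbacks so they can be swapped out.

```javascript
// Hypothetical wrapper: resolves once credentials have been captured and saved.
// `capturer` is expected to behave like a YouTubeCredentialsCapturer instance.
function captureCredentials(capturer, save, showInstructions) {
  return new Promise((resolve) => {
    capturer.on('pending', (data) => {
      // data: { verificationURL, code }
      showInstructions(data);
    });
    capturer.on('capture', (credentials) => {
      save(JSON.stringify(credentials));
      resolve(credentials);
    });
    capturer.begin();
  });
}

// Possible usage (assumes `fs` and a capturer instance are in scope):
// await captureCredentials(
//   capturer,
//   (json) => fs.writeFileSync('/path/to/yt-credentials.json', json),
//   ({ verificationURL, code }) => console.log(`Go to ${verificationURL} and enter ${code}`)
// );
```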

Logger

Logging is optional, but provides useful information about the download process. You can implement your own logger by extending the Logger abstract class:

import { Logger } from 'patreon-dl';

class MyLogger extends Logger {

  log(entry) {
    // Do something with log entry
  }

  // Called when downloader ends, so you can
  // clean up the logger process if necessary.
  end() {
    // This is not an abstract function, so you don't have to
    // implement it if there is no action to be taken here. Default is
    // to resolve right away.
    return Promise.resolve();
  }
}

Each entry passed to log() is an object with the following properties:

  • level: info, debug, warn or error, indicating the severity of the log message.
  • originator: (string or undefined) where the message is coming from.
  • message: array of elements comprising the message. An element can be anything such as a string, Error or object.
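
Because message is an array of mixed elements, a custom logger usually needs to flatten it before output. The following is one possible way to do that; formatEntry is a hypothetical helper, and the output layout is an arbitrary choice.

```javascript
// Hypothetical helper: turns a log entry into a single line of text.
function formatEntry(entry) {
  const text = entry.message
    .map((part) => {
      if (part instanceof Error) return part.message;
      if (typeof part === 'object' && part !== null) return JSON.stringify(part);
      return String(part);
    })
    .join(' ');
  const origin = entry.originator ? ` ${entry.originator}:` : '';
  return `[${entry.level}]${origin} ${text}`;
}

// Inside a Logger subclass, log() could then be as short as:
//   log(entry) { console.log(formatEntry(entry)); }
```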

Built-in loggers

The patreon-dl library comes with the following Logger implementations that you may utilize:

  • ConsoleLogger

    Outputs messages to the console:

    import { ConsoleLogger } from 'patreon-dl';
    
    const myLogger = new ConsoleLogger([options]);
    
    const downloader = await PatreonDownloader.getInstance(url, {
        ...
        logger: myLogger
    });
    
    

    options: (object)

    Option Description
    enabled Whether to enable this logger. Default: true
    logLevel

    info, debug, warn or error. Default: info

    Output messages up to the specified severity level.

    include What to include in log messages: (object)
    • dateTime: show date / time of log messages. Default: true
    • level: show the severity level. Default: true
    • originator: show where the message came from. Default: true
    • errorStack: for Errors, whether to show the full error stack. Default: false
    dateTimeFormat

    The pattern to format date-time strings, when include.dateTime is true.

    Date-time formatting is provided by dateformat library. Refer to the README of that project for pattern rules.

    Default: 'mmm dd HH:MM:ss'

  • FileLogger

    Like ConsoleLogger, but writes messages to file.

    import { FileLogger } from 'patreon-dl';
    
    const myLogger = new FileLogger(options);
    
    const downloader = await PatreonDownloader.getInstance(url, {
        ...
        logger: myLogger
    });
    

    options: all ConsoleLogger options plus the following:

    Option Description
    init

    Values that determine the name of the log file (object):

    • targetURL: The url passed to PatreonDownloader.getInstance()
    • outDir: Value of outDir specified in PatreonDownloader.getInstance() options, or undefined if none specified (in which case defaults to current working directory).
    • date: (optional) Date instance representing the creation date / time of the logger. Default: current date-time.

      You might want to provide this if you are creating multiple FileLogger instances and filenames are to be formatted with the date, otherwise the date-time part of the filenames might have different values.

    logDir

    Path to directory of the log file.

    The path can be a string pattern consisting of the following fields enclosed in curly braces:

    • out.dir: value of outDir provided in init (or the default current working directory if none provided).
    • target.url.path: the pathname of targetURL provided in init, sanitized as necessary.
    • datetime.<date-time format>: the date-time of logger creation, as represented by date in init and formatted according to <date-time format> (using pattern rules defined by the dateformat library).

    logFilename

    Name of the log file.

    The filename can be a string pattern consisting of the following fields enclosed in curly braces:

    • target.url.path: the pathname of targetURL provided in init, sanitized as necessary.
    • datetime.<date-time format>: the date-time of logger creation, as represented by date in init and formatted according to <date-time format> (using pattern rules defined by the dateformat library).

    Default: '{datetime.yyyymmdd}-{log.level}.log'

    fileExistsAction

    What to do if log file already exists? One of the following values:

    • append: append logs to existing file
    • overwrite: overwrite the existing file

    Default: append

  • ChainLogger

    Combines multiple loggers into one single logger.

    import { ConsoleLogger, FileLogger, ChainLogger } from 'patreon-dl';
    
    const consoleLogger = new ConsoleLogger(...);
    const fileLogger = new FileLogger(...);
    const chainLogger = new ChainLogger([ consoleLogger, fileLogger ]);
    
    const downloader = await PatreonDownloader.getInstance(url, {
        ...
        logger: chainLogger
    });
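
When several FileLogger instances are created, the init.date note above suggests sharing one Date so the date-time part of each filename matches. A minimal sketch of building such option sets follows; fileLoggerOptionsPerLevel is a hypothetical helper and the logDir value is an arbitrary choice built from the documented fields.

```javascript
// Hypothetical helper: one FileLogger options object per log level,
// all sharing the same creation date.
function fileLoggerOptionsPerLevel(targetURL, outDir, levels) {
  const date = new Date(); // shared so the date-time part of filenames is identical
  return levels.map((logLevel) => ({
    init: { targetURL, outDir, date },
    logDir: '{out.dir}/logs',
    logFilename: '{datetime.yyyymmdd}-{log.level}.log',
    logLevel,
    fileExistsAction: 'append'
  }));
}

// Possible usage (assumes FileLogger and ChainLogger are imported from 'patreon-dl'):
// const loggers = fileLoggerOptionsPerLevel(url, outDir, ['info', 'error'])
//   .map((options) => new FileLogger(options));
// const logger = new ChainLogger(loggers);
```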
    

Aborting

To prematurely end a download process, use AbortController to send an abort signal to the downloader instance.

const downloader = await PatreonDownloader.getInstance(...);
const abortController = new AbortController();
downloader.start({
    signal: abortController.signal
});

...

abortController.abort();

// Downloader aborts current and pending tasks, then ends.
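
A common variation is to abort automatically after a time limit. The sketch below uses only the standard AbortController and timers; createAbortHandle is a hypothetical helper name.

```javascript
// Hypothetical helper: an abort signal that fires after timeoutMs,
// or earlier if abort() is called manually.
function createAbortHandle(timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  timer.unref?.(); // in Node.js, do not keep the process alive just for the timer
  return {
    signal: controller.signal,
    abort: () => controller.abort(),
    cancel: () => clearTimeout(timer) // call once the download has ended
  };
}

// Possible usage with a downloader instance:
// const handle = createAbortHandle(30 * 60 * 1000);
// await downloader.start({ signal: handle.signal });
// handle.cancel();
```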

Workflow and Events

Workflow

  1. Downloader analyzes given URL and determines what targets to fetch.
  2. Downloader begins fetching data from Patreon servers. Emits fetchBegin event.
  3. Downloader obtains the target(s) from the fetched data for downloading.
  4. For each target (which can be a campaign, product or post):
    1. Downloader emits targetBegin event.
    2. Downloader determines whether the target needs to be downloaded, based on downloader configuration and target info such as accessibility.
      • If target is to be skipped, downloader emits targetEnd event with isSkipped: true. It then proceeds to the next target, if any.
    3. If target is to be downloaded, downloader saves target info (subject to downloader configuration), and emits phaseBegin event with phase: saveInfo. When done, downloader emits phaseEnd event.
    4. Downloader begins saving media belonging to target (again, subject to downloader configuration). Emits phaseBegin event with phase: saveMedia.
      1. Downloader saves files that do not need to be downloaded, e.g. embedded video / link info.
      2. Downloader proceeds to download files (images, videos, audio, attachments, etc.) belonging to the target in batches. For each batch, downloader emits phaseBegin event with phase: batchDownload. When done, downloader emits phaseEnd event with phase: batchDownload.
        • In this phaseBegin event, you can attach listeners to the download batch to monitor events for each download. See Download Task Batch.
    5. Downloader emits phaseEnd event with phase: saveMedia.
    6. Downloader emits targetEnd event with isSkipped: false, and proceeds to the next target.
  5. When there are no more targets to be processed, or a fatal error occurred, downloader ends with end event.

Events

const downloader = await PatreonDownloader.getInstance(...);

downloader.on('fetchBegin', (payload) => {
    ...
});

downloader.start();

Each event emitted by a PatreonDownloader instance has a payload, which is an object with properties containing information about the event.

Event Description
fetchBegin

Emitted when downloader begins fetching data about target(s).

Payload properties:

  • targetType: the type of target being fetched; one of product, post or posts.

targetBegin

Emitted when downloader begins processing a target.

Payload properties:

  • target: the target being processed; one of Campaign, Product or Post.

targetEnd

Emitted when downloader is done processing a target.

Payload properties:

  • target: the target processed; one of Campaign, Product or Post.
  • isSkipped: whether target was skipped.

If isSkipped is true, the following additional properties are available:

  • skipReason: the reason for skipping the target; one of the following enums:
    • TargetSkipReason.Inaccessible
    • TargetSkipReason.AlreadyDownloaded
    • TargetSkipReason.UnmetMediaTypeCriteria
  • skipMessage: description of the skip reason.

phaseBegin

Emitted when downloader begins a phase in the processing of a target.

Payload properties:

  • target: the subject target of the phase; one of Campaign, Product or Post.
  • phase: the phase that is about to begin; one of saveInfo, saveMedia or batchDownload.

If phase is batchDownload, the following additional property is available:

  • batch: an object representing the batch of downloads to be executed by the downloader. For monitoring downloads in the batch, see Download Task Batch.

phaseEnd

Emitted when a phase ends for a target.

Payload properties:

  • target: the subject target of the phase; one of Campaign, Product or Post.
  • phase: the phase that has ended; one of saveInfo, saveMedia or batchDownload.

end

Emitted when downloader ends.

Payload properties:

  • aborted: boolean indicating whether the downloader is ending because of an abort request
  • error: if downloader ends because of an error, then error will be the captured error. Note that error is not necessarily an Error object; it can be anything other than undefined.
  • message: short description about the event
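
As an illustration of consuming these events, the sketch below tallies downloaded and skipped targets and records how the run ended. attachSummary is a hypothetical helper; it only assumes the targetEnd and end payload properties listed above and an on() method on the downloader.

```javascript
// Hypothetical helper: collects a run summary from downloader events.
function attachSummary(downloader) {
  const summary = { downloaded: 0, skipped: {}, ended: null };
  downloader.on('targetEnd', (payload) => {
    if (payload.isSkipped) {
      const reason = String(payload.skipReason);
      summary.skipped[reason] = (summary.skipped[reason] ?? 0) + 1;
    } else {
      summary.downloaded += 1;
    }
  });
  downloader.on('end', (payload) => {
    summary.ended = {
      aborted: Boolean(payload.aborted),
      error: payload.error, // may be any value, not necessarily an Error
      message: payload.message
    };
  });
  return summary;
}
```

Call attachSummary(downloader) before downloader.start(), then inspect the returned object once start() has resolved.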

Download Task Batch

Files are downloaded in batches. Each batch is provided in the payload of phaseBegin event with phase: batchDownload. You can monitor events of individual downloads in the batch as follows:

downloader.on('phaseBegin', (payload) => {
    if (payload.phase === 'batchDownload') {
        const batch = payload.batch;
        batch.on(event, listener);
    }
})

Note that you don't have to remove listeners yourself. They will be removed once the batch ends and is destroyed by the downloader.

Download Task

Each download task in a batch is represented by an object with the following properties:

Property Description
id ID assigned to the task.
src The source of the download: a URL, or a file path when downloading video from a previously downloaded m3u8 playlist.
srcEntity The Downloadable item from which the download task was created.
retryCount The current retry count if download failed previously.
resolvedDestFilename The resolved destination filename of the download, or null if it has not yet been resolved.
resolvedDestFilePath The resolved destination file path of the download, or null if it has not yet been resolved.
getProgress() Function that returns the download progress.

Events

Each event emitted by a download task batch has a payload, which is an object with properties containing information about the event.

Event Description
taskStart

Emitted when a download starts.

Payload properties:

  • task: the download task

taskProgress

Emitted when a download's progress is updated.

Payload properties:

  • task: the download task
  • progress: (object)
    • destFilename: the destination filename of the download
    • destFilePath: the destination file path of the download
    • lengthUnit: the unit measuring the progress. Generally, it would be 'byte', but for videos the unit would be 'second'.
    • length: content length, measured in lengthUnit.
    • lengthDownloaded: length downloaded so far, measured in lengthUnit.
    • percent: percent downloaded
    • sizeDownloaded: size of file downloaded (kB)
    • speed: download speed (kB/s)

    Note: sometimes length is undefined, in which case percent will also be undefined.

taskComplete

Emitted when a download is complete.

Payload properties:

  • task: the download task

taskError

Emitted when a download error occurs.

Payload properties:

  • error: (object)
    • task: the download task
    • cause: Error object or undefined
  • willRetry: whether the download will be reattempted

taskAbort

Emitted when a download is aborted.

Payload properties:

  • task: the download task

taskSkip

Emitted when a download is skipped.

Payload properties:

  • task: the download task
  • reason: (object)
    • name: destFileExists, includeMediaByFilenameUnfulfilled, dependentTaskNotCompleted or other
    • message: string indicating the skip reason

If reason.name is destFileExists, reason will also contain the following property:

  • existingDestFilePath: the existing file path that is causing the download to skip

taskSpawn

Emitted when a download task is spawned from another task.

Payload properties:

  • origin: the original download task
  • spawn: the spawned download task

complete

Emitted when the batch is complete and there are no more downloads pending.

Payload properties: none
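
Putting the batch events together, a progress reporter might be sketched as follows. watchBatch and describeProgress are hypothetical helpers; they rely only on the payload properties documented above, and the message wording is arbitrary.

```javascript
// Hypothetical helper: one-line description of a taskProgress payload.
function describeProgress(task, progress) {
  const done = `${progress.lengthDownloaded} ${progress.lengthUnit}(s)`;
  // length may be undefined, in which case the total is unknown.
  const total = progress.length === undefined ? 'unknown total' : `of ${progress.length}`;
  return `task ${task.id}: ${progress.destFilename} ${done} ${total}`;
}

// Hypothetical helper: attaches reporting listeners to a download task batch.
function watchBatch(batch, print = console.log) {
  batch.on('taskProgress', ({ task, progress }) => {
    if (progress) print(describeProgress(task, progress));
  });
  batch.on('taskError', ({ error, willRetry }) => {
    const cause = error.cause instanceof Error ? error.cause.message : 'unknown cause';
    print(`task ${error.task.id} failed (${cause}); ${willRetry ? 'will retry' : 'giving up'}`);
  });
  batch.on('taskSkip', ({ task, reason }) => {
    print(`task ${task.id} skipped: ${reason.name} - ${reason.message}`);
  });
}
```

watchBatch(payload.batch) would be called from a phaseBegin listener when payload.phase is batchDownload.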

Web server

patreon-dl comes with a web server for serving downloaded content. You can utilize the web server as follows:

import { WebServer } from 'patreon-dl';

const server = new WebServer(options);
await server.start();

console.log(`Web server listening on port: `, server.getConfig().port);

...

await server.stop();

options is an object with the following properties (all optional):

Option Description
dataDir Path to directory containing downloaded content. This mirrors the outDir downloader option. Default: current working directory.
port Port number to listen on. Default: 3000 or a random port number if 3000 is already in use.
logger See Logger, but note that creation of FileLogger is different in the context of web server logging (see below).

Web server file logging

To create a file logger for the web server:

const fileLogger = new FileLogger({
  logFilePath: 'path/to/log/file',
  fileExistsAction: 'append' // or 'overwrite'
});

const server = new WebServer({
  ...
  logger: fileLogger
});