Feature Request: Original Publish Date #248
This isn’t of real interest to me since, in my opinion, the date the physical book was published is not relevant data for audiobooks. They are very different media and should not be lumped together. I will concede that the date an audiobook was first published could be interesting, but that requires a release-group format of metadata, which is something AudiobookDB will be able to handle. Scraping implementations are bound to fail; it’s always just a matter of time. This is why Audnexus takes an API-first, scraping-second approach, with massive safeguards when scraping. |
Just wanted to post an update on my opinion on this. The main reason I feel as though the original publish date of a book is useful is that, in any real world use of this tool, sorting by the original publish date is the only way to guarantee books are sorted in the correct order, especially in a series. Audible frequently removes original versions of audiobooks in favor of "movie editions", which often aren't any different from an original version except for the cover. If you try to use the release date of the movie version to sort a series of books in the order you should read them, you'll often end up with the movie edition of the first book in a series last, or towards the end. For a practical example, I'd use the Plex metadata agent that is based on this tool. I'd argue that sorting audiobooks on Plex simply by the date that an audiobook version of a book is released, is almost entirely useless compared to when the book was originally published. And as far as how to get that info on a book, I discovered that Readarr (which I linked in my original post) actually includes their API key for the Goodreads API in the source code, albeit in an obfuscated way. I tried implementing that in my own project for tagging audiobook files with proper metadata and it works like a charm. I'm not necessarily encouraging using theirs, but if you could get your hands on a Goodreads API key, they definitely provide that information with an API first approach like you said. |
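To make the sorting argument above concrete, here is a minimal Python sketch. The titles and dates are made-up placeholders, and the field names `release_date` and `original_publish_date` are assumptions for illustration, not actual Audnexus fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Audiobook:
    title: str
    release_date: date           # when this audiobook edition was released
    original_publish_date: date  # when the book itself was first published


# Hypothetical series in which book 1 was re-released as a "movie edition".
series = [
    Audiobook("Book 1 (Movie Edition)", date(2021, 9, 1), date(2005, 3, 1)),
    Audiobook("Book 2", date(2010, 6, 1), date(2007, 4, 1)),
    Audiobook("Book 3", date(2012, 6, 1), date(2009, 5, 1)),
]

# Sorting by audiobook release date pushes the re-released first book to the end.
by_release = [b.title for b in sorted(series, key=lambda b: b.release_date)]

# Sorting by original publish date restores the reading order.
by_original = [b.title for b in sorted(series, key=lambda b: b.original_publish_date)]
```

With this placeholder data, the movie edition of the first book lands last when sorting by release date and first when sorting by original publish date.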
Hmm, you make a good point there. I'm not opposed to including original publication date where available. I would probably just not make it the default sort. My concern with even integrating Goodreads is they've made it clear they don't intend to support the public API moving forward, so the rug could be pulled from under us at any moment. I also wouldn't want to put Readarr at risk of having their key revoked, as I'm not sure what if any TOS they agreed to in getting the key. I'll still take a look at their implementation to see if I can glean any knowledge. |
I understand not making it the default, my main point was just that it's an important field to have in terms of overall audiobook metadata. And definitely understandable not wanting to yoink their API key, I've only used it in a project that isn't public and I'm only doing so because they're confident enough in using the same API key for each self hosted instance of their app that is out there. And if you're at all curious what the results from their API look like, here is an example request: https://gist.github.com/csandman/f86dabe760a90477504c1a15fcada874 Unfortunately the response is in XML, but I was able to use the As far as them closing their API, I definitely understand not wanting to rely on an API that is planned to be removed. I've been keeping my eyes out for any alternatives that offer the same feature but haven't yet found one. If I do though, I'll definitely post any updates. |
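As a rough illustration of pulling the original publication date out of an XML response, here is a Python sketch using only the standard library. The sample document and its element names (`book`, `work`, `original_publication_year`, and so on) are assumptions made for this example; check them against the real response in the gist above before relying on them.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed, simplified response layout (not copied from a real API response).
SAMPLE_XML = """<GoodreadsResponse>
  <book>
    <title>Example Title</title>
    <work>
      <original_publication_year type="integer">2005</original_publication_year>
      <original_publication_month type="integer">3</original_publication_month>
      <original_publication_day type="integer">1</original_publication_day>
    </work>
  </book>
</GoodreadsResponse>"""


def original_publish_date(xml_text: str) -> Optional[str]:
    """Return an ISO date string, or None when no publication year is present."""
    root = ET.fromstring(xml_text)
    work = root.find("./book/work")
    if work is None:
        return None
    year = work.findtext("original_publication_year")
    if not year:
        return None
    # Month and day may be missing; fall back to the first of the month/year.
    month = work.findtext("original_publication_month") or "1"
    day = work.findtext("original_publication_day") or "1"
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"
```

Returning `None` instead of raising keeps the field optional, which matches the "where available" framing earlier in the thread.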
XML shouldn't be too big of a problem. I forgot they're distributing the API key, so I wouldn't be in breach of TOS since it's on my system already (I think). I'll see if I can get some preliminary testing going this week. As for other services, AudiobookDB is almost polished enough for public usage (frontend still needs to be written), so keep an eye out for that when it's opened to public 😏 It has Audnexus integration as well for import logic, so it makes sense to add Goodreads here first and then over there. |
Sounds good! So what exactly is AudiobookDB? Is there a public repo for it yet? Is it just supposed to be a collection of data acquired from audnexus essentially? |
Also, here's an example of how I'm parsing the Goodreads API in TypeScript in case it could offer you any ideas: https://gist.github.com/csandman/dba05dc48f29592d0db535282c00a2af I made it a little hastily but it's mostly effective. |
With some knowledge, you can obtain an access token by making a request to https://www.goodreads.com/oauth/grant_access_token.xml. Goodreads uses the https://api.amazon.com/auth/register endpoint to register a new Goodreads device. This way you get a private Amazon access token which can be used to authenticate your requests to the Goodreads API and the |
@mkb79 any chance you could give more details on what you're describing? do you still need a goodreads API key in the first place to make that work? |
@csandman |
I am definitely curious about the exact process, because I can't seem to find any up to date resources on it. Overall it could be helpful for this issue as well so you could post it here, but otherwise you could post more details on one of the gists above if you'd like. |
Here is a proof of concept of how registration and deregistration work:

import base64
import gzip
import hashlib
import hmac
import json
import secrets
import uuid
from datetime import datetime
from io import BytesIO
from functools import partialmethod
from typing import Tuple, Union

import httpx
from pyaes import AESModeOfOperationCBC, Encrypter, Decrypter

USER_AGENT = "AmazonWebView/GoodreadsForIOS App/4.0.1/iOS/15.4.1/iPhone"
FRC_SIG_SALT: bytes = b"HmacSHA256"
FRC_AES_SALT: bytes = b"AES/CBC/PKCS7Padding"


class FrcCookieHelper:
    def __init__(self, password: str) -> None:
        self.password = password.encode()

    def _get_key(self, salt: bytes) -> bytes:
        # PBKDF2-HMAC-SHA1, 1000 iterations, 16-byte key
        return hashlib.pbkdf2_hmac("sha1", self.password, salt, 1000, 16)

    get_signature_key = partialmethod(_get_key, FRC_SIG_SALT)
    get_aes_key = partialmethod(_get_key, FRC_AES_SALT)

    @staticmethod
    def unpack(frc: str) -> Tuple[bytes, bytes, bytes]:
        # re-add only the base64 padding that is actually missing
        pad = "=" * (-len(frc) % 4)
        buf = BytesIO(base64.b64decode(frc + pad))
        buf.seek(1)  # the first byte is always 0, so skip it
        return buf.read(8), buf.read(16), buf.read()  # sig, iv, data

    @staticmethod
    def pack(sig: bytes, iv: bytes, data: bytes) -> str:
        frc = b"\x00" + sig[:8] + iv[:16] + data
        frc = base64.b64encode(frc).strip(b"=")
        return frc.decode()

    def verify_signature(self, frc: str) -> bool:
        key = self.get_signature_key()
        sig, iv, data = self.unpack(frc)
        new_signature = hmac.new(key, iv + data, hashlib.sha256).digest()
        return hmac.compare_digest(sig, new_signature[:len(sig)])

    def decrypt(self, frc: str, verify_signature: bool = True) -> bytes:
        # fail loudly instead of silently ignoring a bad signature
        if verify_signature and not self.verify_signature(frc):
            raise ValueError("frc signature verification failed")
        key = self.get_aes_key()
        sig, iv, data = self.unpack(frc)
        decrypter = Decrypter(AESModeOfOperationCBC(key, iv))
        decrypted = decrypter.feed(data) + decrypter.feed()
        decompressed = gzip.decompress(decrypted)
        return decompressed

    def encrypt(self, data: Union[str, dict]) -> str:
        if isinstance(data, dict):
            data = json.dumps(data, indent=2, separators=(",", " : ")).replace("/", "\\/")
        if isinstance(data, str):
            data = data.encode()  # gzip needs bytes
        compressed = BytesIO()
        with gzip.GzipFile(fileobj=compressed, mode="wb", mtime=0) as f:
            f.write(data)
        # overwrite the XFL and OS bytes of the gzip header (offsets 8 and 9)
        compressed.seek(8)
        compressed.write(b"\x00\x13")
        compressed = compressed.getvalue()
        key = self.get_aes_key()
        iv = secrets.token_bytes(16)
        encrypter = Encrypter(AESModeOfOperationCBC(key, iv))
        encrypted = encrypter.feed(compressed) + encrypter.feed()
        key = self.get_signature_key()
        signature = hmac.new(key, iv + encrypted, hashlib.sha256).digest()
        packed = self.pack(signature, iv, encrypted)
        # pad back to a multiple of 4 characters
        return packed + "=" * (-len(packed) % 4)
def register(username, password):
    url = "https://api.amazon.com/auth/register"
    device_serial = secrets.token_hex(16).upper()
    frc = {
        "ApplicationVersion": "4.1",
        "DeviceOSVersion": "iOS/15.5",
        "ScreenWidthPixels": "428",
        "TimeZone": "+02:00",
        "ScreenHeightPixels": "926",
        "ApplicationName": "Goodreads",
        "DeviceJailbroken": False,
        "DeviceLanguage": "en-DE",
        # datetime.now() (not utcnow()) so that timestamp() yields the correct Unix time
        "DeviceFingerprintTimestamp": round(datetime.now().timestamp()) * 1000,
        "ThirdPartyDeviceId": str(uuid.uuid4()).upper(),
        "DeviceName": "iPhone",
        "Carrier": "Vodafone.de"
    }
    # the device serial doubles as the password for the frc key derivation
    frc = FrcCookieHelper(device_serial).encrypt(frc)
    headers = {
        "x-amzn-identity-auth-domain": "goodreads.com",
        "User-Agent": USER_AGENT,
        "Accept-Encoding": "gzip",
        "Accept": "application/json",
        "Accept-Language": "en-DE",
        "Accept-Charset": "utf-8"
    }
    json_body = {
        "requested_extensions": [
            "device_info",
            "customer_info"
        ],
        "cookies": {
            "website_cookies": [],
            "domain": ".goodreads.com"
        },
        "registration_data": {
            "domain": "Device",
            "app_version": "4.1",
            "device_type": "A3NWHXTQ4EBCZS",
            "os_version": "15.5",
            "device_serial": device_serial,
            "device_model": "iPhone",
            "app_name": "GoodreadsForIOS App",
            "software_version": "1"
        },
        "auth_data": {
            "user_id_password": {
                "user_id": username,
                "password": password
            }
        },
        "user_context_map": {
            "frc": frc
        },
        "requested_token_type": [
            "bearer",
            "mac_dms",
            "website_cookies"
        ]
    }
    r = httpx.post(url, headers=headers, json=json_body)
    return r


def deregister(access_token):
    json_body = {"deregister_all_existing_accounts": True}
    headers = {"Authorization": f"Bearer {access_token}"}
    r = httpx.post(
        "https://api.amazon.com/auth/deregister",
        json=json_body,
        headers=headers
    )
    return r


def refresh_access_token(refresh_token):
    pass


def exchange_cookies(refresh_token):
    pass

You need Then you can request the Goodreads API using |
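For readers who want to see the frc byte layout without installing `pyaes`, here is a standard-library-only sketch of the key derivation, signing, and pack/unpack steps from the proof of concept above. The AES step is deliberately left out: `ciphertext` is random placeholder bytes standing in for the encrypted payload, and the helper names `derive_key`, `pack`, and `unpack` are chosen for this example.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Tuple


def derive_key(password: bytes, salt: bytes) -> bytes:
    # Same parameters as the proof of concept: PBKDF2-HMAC-SHA1, 1000 rounds, 16 bytes.
    return hashlib.pbkdf2_hmac("sha1", password, salt, 1000, 16)


def pack(sig: bytes, iv: bytes, data: bytes) -> str:
    # Layout: 0x00 | first 8 bytes of the signature | 16-byte IV | ciphertext
    raw = b"\x00" + sig[:8] + iv[:16] + data
    return base64.b64encode(raw).decode()


def unpack(frc: str) -> Tuple[bytes, bytes, bytes]:
    raw = base64.b64decode(frc + "=" * (-len(frc) % 4))
    return raw[1:9], raw[9:25], raw[25:]


device_serial = secrets.token_hex(16).upper().encode()
sig_key = derive_key(device_serial, b"HmacSHA256")

iv = secrets.token_bytes(16)
ciphertext = secrets.token_bytes(32)  # placeholder for the AES-CBC output
signature = hmac.new(sig_key, iv + ciphertext, hashlib.sha256).digest()

packed = pack(signature, iv, ciphertext)
sig2, iv2, data2 = unpack(packed)

# Recompute the signature over the unpacked parts and compare the 8-byte prefix.
expected = hmac.new(sig_key, iv2 + data2, hashlib.sha256).digest()[:8]
signature_ok = hmac.compare_digest(sig2, expected)
```

Only the first 8 bytes of the HMAC are stored, so verification compares against a truncated digest rather than the full 32 bytes.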
Thanks for the detailed example! Now time to see if I can translate this to node... |
@csandman Have you tried this code out? If yes, could you successfully register/unregister a device? I've tested this only on my machine. |
haven't tried unregister yet but it does appear to be working! The token you're talking about I assume is the I'm also close to finishing a node version, but idk if I did it right. Translating all this buffer manipulation stuff is always a pain haha. |
Yes, this is the access token. The token is valid for 60 minutes after registration. Before the token expires, you have to deregister or refresh it with the refresh token. Refreshing the token is easy with the correct request headers, params and body. |
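One way to keep track of the 60-minute lifetime is sketched below in Python. It assumes the `bearer` dictionary has the `access_token`, `refresh_token`, and `expires_in` keys that appear in the register response types later in this thread; the `BearerToken` class and the five-minute safety margin are choices made for this example.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class BearerToken:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix time at which the access token stops being valid

    @classmethod
    def from_bearer(cls, bearer: dict, now: Optional[float] = None) -> "BearerToken":
        issued = time.time() if now is None else now
        # expires_in is handled as a string of seconds here, hence int().
        return cls(
            bearer["access_token"],
            bearer["refresh_token"],
            issued + int(bearer["expires_in"]),
        )

    def needs_refresh(self, now: Optional[float] = None, margin: float = 300.0) -> bool:
        # Refresh a little early so a request never goes out with a stale token.
        current = time.time() if now is None else now
        return current >= self.expires_at - margin


token = BearerToken.from_bearer(
    {"access_token": "placeholder-access", "refresh_token": "placeholder-refresh", "expires_in": "3600"},
    now=1000.0,
)
```

Passing `now` explicitly makes the logic easy to test; in normal use both methods fall back to the current time.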
I'm having quite the time writing this in TS, since I've never worked with python's |
Will do, I feel like I'm close but for some reason it's still not working. I've done similar conversion of python code to TS code before, and I've found using the Nodejs Here's what I have so far if you want to check it out: EDIT: I ended up figuring it out! Man that was rough, had to learn way more about how the Let me know if you have any trouble getting it working! // types/goodreads.ts
export interface GoodreadsFrc {
ApplicationVersion: string;
DeviceOSVersion: string;
ScreenWidthPixels: string;
TimeZone: string;
ScreenHeightPixels: string;
ApplicationName: string;
DeviceJailbroken: boolean;
DeviceLanguage: string;
DeviceFingerprintTimestamp: number;
ThirdPartyDeviceId: string;
DeviceName: string;
Carrier: string;
}
export interface GoodreadsRegisterRequest {
requested_extensions: string[];
cookies: {
website_cookies: string[];
domain: string;
};
registration_data: {
domain: string;
app_version: string;
device_type: string;
os_version: string;
device_serial: string;
device_model: string;
app_name: string;
software_version: string;
};
auth_data: {
user_id_password: {
user_id: string;
password: string;
};
};
user_context_map: {
frc: string;
};
requested_token_type: string[];
}
export interface GoodreadsRegisterFailureResponse {
response: {
challenge: {
challenge_reason: string;
uri: string;
required_authentication_method: string;
};
};
request_id: string;
}
export interface GoodreadsRegisterSuccessResponse {
response: {
success: {
extensions: {
device_info: {
device_name: string;
device_serial_number: string;
device_type: string;
};
customer_info: {
account_pool: string;
preferred_marketplace: string;
country_of_residence: string;
user_id: string;
home_region: string;
name: string;
given_name: string;
source_of_country_of_residence: string;
};
};
tokens: {
mac_dms: {
device_private_key: string;
adp_token: string;
};
bearer: {
access_token: string;
refresh_token: string;
expires_in: string;
};
};
customer_id: string;
};
};
request_id: string;
}
export interface GoodreadsDeregisterFailureResponse {
response: {
error: {
code: string;
message: string;
};
};
request_id: string;
}
export interface GoodreadsDeregisterSuccessResponse {
response: {
success: Record<string, never>;
};
request_id: string;
}
export interface GoodreadsRefreshFailureResponse {
error_index: string;
error_description: string;
error: string;
}
export interface GoodreadsRefreshSuccessResponse {
access_token: string;
token_type: string;
expires_in: number;
}
export interface GoodreadsCookieFailureResponse {
response: {
error: {
code: string;
detail: string;
message: string;
};
};
request_id: string;
}
export interface GoodreadsCookie {
Path: string;
Secure: boolean;
Value: string;
Expires: string;
HttpOnly: boolean;
Name: string;
}
export interface GoodreadsCookieSuccessResponse {
response: {
tokens: {
cookies: {
".goodreads.com": GoodreadsCookie[];
};
};
};
request_id: string;
}
import {
createCipheriv,
createDecipheriv,
createHmac,
pbkdf2Sync,
randomBytes,
randomFill,
randomUUID,
} from "crypto";
import fetch from "node-fetch";
import type {
GoodreadsCookieFailureResponse,
GoodreadsCookieSuccessResponse,
GoodreadsDeregisterFailureResponse,
GoodreadsDeregisterSuccessResponse,
GoodreadsFrc,
GoodreadsRefreshFailureResponse,
GoodreadsRefreshSuccessResponse,
GoodreadsRegisterFailureResponse,
GoodreadsRegisterRequest,
GoodreadsRegisterSuccessResponse,
} from "types/goodreads";
import { promisify } from "util";
import { gunzipSync, gzipSync } from "zlib";
const REGISTER_URL = "https://api.amazon.com/auth/register";
const DEREGISTER_URL = "https://api.amazon.com/auth/deregister";
const REFRESH_URL = "https://api.amazon.com/auth/token";
const COOKIES_URL = "https://api.amazon.com/ap/exchangetoken/cookies";
const USER_AGENT = "AmazonWebView/GoodreadsForIOS App/4.0.1/iOS/15.4.1/iPhone";
class FrcCookieHelper {
static FRC_SIG_SALT = Buffer.from("HmacSHA256");
static FRC_AES_SALT = Buffer.from("AES/CBC/PKCS7Padding");
static CIPHER_ALGORITHM = "aes-128-cbc";
password: string;
constructor(password: string) {
this.password = password;
}
getKey(salt: Buffer) {
return pbkdf2Sync(this.password, salt, 1000, 16, "sha1");
}
getSignatureKey() {
return this.getKey(FrcCookieHelper.FRC_SIG_SALT);
}
getAesKey() {
return this.getKey(FrcCookieHelper.FRC_AES_SALT);
}
static getRandomIv() {
return new Promise<Uint8Array>((resolve, reject) => {
randomFill(new Uint8Array(16), (err, arr) => {
if (err) {
reject(err);
return;
}
resolve(arr);
});
});
}
static unpack(frc: string): [Buffer, Buffer, Buffer] {
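// Re-add only the padding that is actually missing (none if the length is already a multiple of 4).
const pad = "=".repeat((4 - (frc.length % 4)) % 4);
const newFrc = Buffer.from(frc + pad, "base64");
// Byte 0 is a constant 0x00 prefix, followed by 8 signature bytes, 16 IV bytes, and the ciphertext.
const sig = newFrc.slice(1, 9);
const iv = newFrc.slice(9, 25);
const data = newFrc.slice(25);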
return [sig, iv, data];
}
static pack(sig: Buffer, iv: Uint8Array, data: Buffer): string {
// Layout: one zero byte, the first 8 signature bytes, the 16-byte IV, then the ciphertext.
const frc = Buffer.concat([Buffer.from([0x00]), sig.slice(0, 8), iv, data]);
// Encode first; stripping "=" from the raw bytes could drop real ciphertext bytes equal to 0x3d.
// toString("base64") already returns correctly padded output.
return frc.toString("base64");
}
verifySignature(frc: string): boolean {
const key = this.getSignatureKey();
const [sig, iv, data] = FrcCookieHelper.unpack(frc);
const hmac = createHmac("sha256", key);
hmac.update(Buffer.concat([iv, data]));
const newSignature = hmac.digest();
// Buffers are objects, so === compares references; compare the contents instead.
return sig.equals(newSignature.slice(0, sig.length));
}
decrypt(frc: string, verifySignature = true): Buffer {
if (verifySignature && !this.verifySignature(frc)) {
throw new Error("FRC signature verification failed");
}
const key = this.getAesKey();
const [, iv, data] = FrcCookieHelper.unpack(frc);
const decipher = createDecipheriv(
FrcCookieHelper.CIPHER_ALGORITHM,
key,
iv
);
const decrypted = decipher.update(data);
const decryptedFinal = decipher.final();
const decryptedData = Buffer.concat([decrypted, decryptedFinal]);
return gunzipSync(decryptedData);
}
async encrypt(data: string | GoodreadsFrc): Promise<string> {
let dataStr: string;
if (typeof data === "object") {
dataStr = JSON.stringify(data, null, 2);
} else {
dataStr = data;
}
const zip = gzipSync(dataStr);
const gzippedData = Buffer.concat([
zip.slice(0, 8),
Buffer.from([0x00, 0x13]),
zip.slice(10),
]);
const aesKey = this.getAesKey();
const iv = await FrcCookieHelper.getRandomIv();
const cipher = createCipheriv(FrcCookieHelper.CIPHER_ALGORITHM, aesKey, iv);
const encrypted = cipher.update(gzippedData);
const encryptedFinal = cipher.final();
const encryptedData = Buffer.concat([encrypted, encryptedFinal]);
const sigKey = this.getSignatureKey();
const hmac = createHmac("sha256", sigKey);
hmac.write(Buffer.concat([iv, encryptedData]));
const signature = hmac.digest();
const packed = FrcCookieHelper.pack(signature, iv, encryptedData);
return packed + "=".repeat(packed.length % 4);
}
}
const randomBytesAsync = promisify(randomBytes);
export const register = async (username: string, password: string) => {
const deviceSerial = await randomBytesAsync(16).then((buf) =>
buf.toString("hex").toUpperCase()
);
const frcBase: GoodreadsFrc = {
ApplicationVersion: "4.1",
DeviceOSVersion: "iOS/15.5",
ScreenWidthPixels: "428",
TimeZone: "+02:00",
ScreenHeightPixels: "926",
ApplicationName: "Goodreads",
DeviceJailbroken: false,
DeviceLanguage: "en-DE",
DeviceFingerprintTimestamp: new Date().getTime(),
ThirdPartyDeviceId: randomUUID().toUpperCase(),
DeviceName: "iPhone",
Carrier: "Vodafone.de",
};
const frcHelper = new FrcCookieHelper(deviceSerial);
const frc = await frcHelper.encrypt(frcBase);
const headers = {
"x-amzn-identity-auth-domain": "goodreads.com",
"User-Agent": USER_AGENT,
"Accept-Encoding": "gzip",
Accept: "application/json",
"Accept-Language": "en-DE",
"Accept-Charset": "utf-8",
};
const body: GoodreadsRegisterRequest = {
requested_extensions: ["device_info", "customer_info"],
cookies: {
website_cookies: [],
domain: ".goodreads.com",
},
registration_data: {
domain: "Device",
app_version: "4.1",
device_type: "A3NWHXTQ4EBCZS",
os_version: "15.5",
device_serial: deviceSerial,
device_model: "iPhone",
app_name: "GoodreadsForIOS App",
software_version: "1",
},
auth_data: {
user_id_password: {
user_id: username,
password,
},
},
user_context_map: {
frc,
},
requested_token_type: ["bearer", "mac_dms", "website_cookies"],
};
const res = await fetch(REGISTER_URL, {
method: "POST",
headers,
body: JSON.stringify(body),
});
const resData = (await res.json()) as
| GoodreadsRegisterSuccessResponse
| GoodreadsRegisterFailureResponse;
console.log("GOODREADS AUTH RESPONSE");
console.dir(resData, { depth: null });
if ("challenge" in resData.response) {
throw new Error(resData.response.challenge.challenge_reason);
}
return resData as GoodreadsRegisterSuccessResponse;
};
export const deregister = async (
accessToken: string
): Promise<GoodreadsDeregisterSuccessResponse> => {
const res = await fetch(DEREGISTER_URL, {
method: "POST",
headers: {
Authorization: `Bearer ${accessToken}`,
},
body: JSON.stringify({
deregister_all_existing_accounts: true,
}),
});
const resData = (await res.json()) as
| GoodreadsDeregisterSuccessResponse
| GoodreadsDeregisterFailureResponse;
console.log("GOODREADS DEREGISTER RESPONSE");
console.dir(resData, { depth: null });
if ("error" in resData.response) {
throw new Error(resData.response.error.message);
}
return resData as GoodreadsDeregisterSuccessResponse;
};
export const refreshAccessToken = async (
refreshToken: string
): Promise<GoodreadsRefreshSuccessResponse> => {
const res = await fetch(REFRESH_URL, {
method: "POST",
headers: {
"x-amzn-identity-auth-domain": "goodreads.com",
"User-Agent": USER_AGENT,
"Accept-Encoding": "gzip",
"Content-Type": "application/x-www-form-urlencoded",
},
body: new URLSearchParams({
app_name: "GoodreadsForIOS App",
app_version: "4.0.1",
"di.sdk.version": "6.12.1",
source_token: refreshToken,
package_name: "com.goodreads.Goodreads",
"di.hw.version": "iPhone",
platform: "iOS",
requested_token_type: "access_token",
source_token_type: "refresh_token",
"di.os.name": "iOS",
"di.os.version": "15.4.1",
current_version: "6.12.1",
}),
});
const resData = (await res.json()) as
| GoodreadsRefreshSuccessResponse
| GoodreadsRefreshFailureResponse;
console.log("GOODREADS REFRESH TOKEN RESPONSE");
console.dir(resData, { depth: null });
if ("error_description" in resData) {
throw new Error(resData.error_description);
}
return resData as GoodreadsRefreshSuccessResponse;
};
export const exchangeCookies = async (
refreshToken: string
): Promise<GoodreadsCookieSuccessResponse> => {
const res = await fetch(COOKIES_URL, {
method: "POST",
headers: {
"x-amzn-identity-auth-domain": "goodreads.com",
"User-Agent": USER_AGENT,
"Accept-Encoding": "gzip",
"Content-Type": "application/x-www-form-urlencoded",
},
body: new URLSearchParams({
"openid.assoc_handle": "amzn_goodreads_web_na",
app_name: "GoodreadsForIOS App",
app_version: "4.0.1",
"di.sdk.version": "6.12.1",
domain: ".goodreads.com",
source_token: refreshToken,
"di.hw.version": "iPhone",
cookies: "eyJjb29raWVzIjp7Ii5nb29kcmVhZHMuY29tIjpbXX19",
requested_token_type: "auth_cookies",
source_token_type: "refresh_token",
"di.os.name": "iOS",
"di.os.version": "15.4.1",
}),
});
const resData = (await res.json()) as
| GoodreadsCookieSuccessResponse
| GoodreadsCookieFailureResponse;
console.log("GOODREADS COOKIES RESPONSE");
console.dir(resData, { depth: null });
if ("error" in resData.response) {
throw new Error(resData.response.error.message);
}
return resData as GoodreadsCookieSuccessResponse;
}; |
I ended up figuring it out, updated my previous comment with a working example. |
Oh, I also finally tested actually using the bearer token to pull from the API, and it works! Probably should have made sure of that before I went through the whole process of converting the file, but glad it works anyway haha. |
Here is the function to refresh the access token:

def refresh_access_token(refresh_token):
    url = "https://api.amazon.com/auth/token"
    headers = {
        "x-amzn-identity-auth-domain": "goodreads.com",
        "User-Agent": USER_AGENT,
        "Accept-Encoding": "gzip"
    }
    body = {
        "app_name": "GoodreadsForIOS App",
        "app_version": "4.0.1",
        "di.sdk.version": "6.12.1",
        "source_token": refresh_token,
        "package_name": "com.goodreads.Goodreads",
        "di.hw.version": "iPhone",
        "platform": "iOS",
        "requested_token_type": "access_token",
        "source_token_type": "refresh_token",
        "di.os.name": "iOS",
        "di.os.version": "15.4.1",
        "current_version": "6.12.1"
    }
    # data= sends the body form-urlencoded
    r = httpx.post(url, data=body, headers=headers)
    return r |
Finally, here is the part that exchanges the token for cookies:

def exchange_cookies(refresh_token):
    url = "https://api.amazon.com/ap/exchangetoken/cookies"
    headers = {
        "x-amzn-identity-auth-domain": "goodreads.com",
        "User-Agent": USER_AGENT,
        "Accept-Encoding": "gzip"
    }
    body = {
        "openid.assoc_handle": "amzn_goodreads_web_na",
        "app_name": "GoodreadsForIOS App",
        "app_version": "4.0.1",
        "di.sdk.version": "6.12.1",
        "domain": ".goodreads.com",
        "source_token": refresh_token,
        "di.hw.version": "iPhone",
        "cookies": "eyJjb29raWVzIjp7Ii5nb29kcmVhZHMuY29tIjpbXX19",
        "requested_token_type": "auth_cookies",
        "source_token_type": "refresh_token",
        "di.os.name": "iOS",
        "di.os.version": "15.4.1"
    }
    # data= sends the body form-urlencoded
    r = httpx.post(url, data=body, headers=headers)
    return r |
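The opaque `cookies` value in the request body above is plain base64-encoded JSON. The short Python sketch below decodes it and shows how the same value can be rebuilt; `encode_cookies` is a helper name made up for this example.

```python
import base64
import json

COOKIES_PARAM = "eyJjb29raWVzIjp7Ii5nb29kcmVhZHMuY29tIjpbXX19"

# Decoding shows an empty cookie list for the .goodreads.com domain.
decoded = json.loads(base64.b64decode(COOKIES_PARAM))


def encode_cookies(cookies_by_domain: dict) -> str:
    # Compact separators reproduce the exact bytes of the original value.
    payload = json.dumps({"cookies": cookies_by_domain}, separators=(",", ":"))
    return base64.b64encode(payload.encode()).decode()


rebuilt = encode_cookies({".goodreads.com": []})
```

So the request simply tells the endpoint that the client currently holds no cookies for `.goodreads.com`.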
So, it’s over to you for further progress ;)! If you need any help, feel free to contact me. |
@mkb79 thanks for all the extra info! I was just thinking about asking if you had a hint on the other requests. Also thanks a ton for this whole thing, I can definitely think of a few applications for it! So I was able to get the refresh function working, but I did have to add {
response: {
error: {
code: 'MissingValue',
detail: 'Missing parameter: app_name',
message: 'One or more required values are missing'
}
},
request_id: 'F8Q090DH5W85QP7REAGK'
} Which is odd because I copied your I was also curious, what's the |
The right Content-Type for
Maybe the solution is sending the data in urlencoded format? Or can you post your code implementation?
Some requests use these cookies in addition to the access token. But I had no issues sending the request without the cookies. So I posted the code here for completeness. Maybe these cookies can be used to make authenticated requests to Goodreads.com?! Edit: |
Interesting that you say that because I saw the same thing in some different AWS docs, but it has been working so far for all of the other requests to use the import FormData from "form-data";
import fetch from "node-fetch";
const COOKIES_URL = "https://api.amazon.com/ap/exchangetoken/cookies";
const USER_AGENT = "AmazonWebView/GoodreadsForIOS App/4.0.1/iOS/15.4.1/iPhone";
export const exchangeCookies = async (refreshToken: string) => {
const headers = {
"x-amzn-identity-auth-domain": "goodreads.com",
"User-Agent": USER_AGENT,
"Accept-Encoding": "gzip",
"Content-Type": "application/x-www-form-urlencoded",
};
const body = {
"openid.assoc_handle": "amzn_goodreads_web_na",
app_name: "GoodreadsForIOS App",
app_version: "4.0.1",
"di.sdk.version": "6.12.1",
domain: ".goodreads.com",
source_token: refreshToken,
"di.hw.version": "iPhone",
cookies: "eyJjb29raWVzIjp7Ii5nb29kcmVhZHMuY29tIjpbXX19",
requested_token_type: "auth_cookies",
source_token_type: "refresh_token",
"di.os.name": "iOS",
"di.os.version": "15.4.1",
};
const formBody = new FormData();
Object.entries(body).forEach(([key, value]) => {
formBody.append(key, value);
});
const res = await fetch(COOKIES_URL, {
method: "POST",
headers,
body: formBody,
});
const resData = await res.json();
console.log("COOKIES RESPONSE");
console.dir(resData, {
depth: null,
});
return resData;
}; I'm not overly concerned about making this function work though, I don't think I'd really need it for my use case. But like you said, completeness is nice. |
This works for me
Edit: |
Some POST requests made by the Goodreads/Audible/Kindle iOS apps are JSON-encoded and some are URL-encoded. The refresh token and cookie exchange requests are URL-encoded. Maybe JSON will work too, but I haven’t tried this out yet. |
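The difference between the two encodings can be seen with the Python standard library alone. The body below is a shortened subset of the refresh request parameters, used here only for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode

body = {
    "app_name": "GoodreadsForIOS App",
    "source_token_type": "refresh_token",
    "di.os.name": "iOS",
}

# application/x-www-form-urlencoded: key=value pairs joined by "&", spaces become "+".
form_encoded = urlencode(body)

# application/json: the same data serialized as a JSON document.
json_encoded = json.dumps(body)
```

In the httpx calls earlier in the thread, `data=body` produces the first form and `json=json_body` the second, which is why register/deregister and refresh/exchange use different keyword arguments.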
Great that worked for me, thanks for all the help @mkb79! This will definitely be useful for my own project, and I'm sure @djdembeck can get some use out of it! Just a heads up, I edited my code again with a more complete example including all of the functions originally mentioned, as well as types for everything (success and error responses) and some basic error handling. @djdembeck I'm sure you'll have other ideas about how to handle the types and errors, as well as the responses (I'm still pretty new to TypeScript) but hopefully this is a good starting point! |
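For anyone following along in Python rather than TypeScript, the success/failure split described above could look roughly like this. The dictionary shapes follow the register response types posted earlier in the thread; `RegistrationError` and `extract_bearer` are names invented for this sketch, and the sample values are placeholders.

```python
class RegistrationError(Exception):
    """Raised when the register endpoint answers with a challenge instead of tokens."""


def extract_bearer(res: dict) -> dict:
    response = res.get("response", {})
    if "challenge" in response:
        # Failure shape: response.challenge.challenge_reason
        reason = response["challenge"].get("challenge_reason", "unknown challenge")
        raise RegistrationError(reason)
    # Success shape: response.success.tokens.bearer
    return response["success"]["tokens"]["bearer"]


# Placeholder payloads mirroring the two shapes.
ok = {"response": {"success": {"tokens": {"bearer": {"access_token": "placeholder"}}}}}
failed = {"response": {"challenge": {"challenge_reason": "PlaceholderReason"}}}

bearer = extract_bearer(ok)
try:
    extract_bearer(failed)
    error_message = None
except RegistrationError as exc:
    error_message = str(exc)
```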
Awesome! Love seeing the collaboration. mkb79 is an absolute genius. First he gave us Audible's API docs and methods and now GR 😅 Tremendous help mate and thanks for stopping in to help us on this! I'm tracking the GR work in a project: https://github.com/laxamentumtech/audnexus/projects/2 |
Thanks go to everyone who helped out. Without @csandman there would be no port to Node. I'm only a spare-time hobby coder with little time (currently)! But if I can help out more, feel free to contact me. |
@mkb79 I definitely appreciate the help, I've already made good use of this in my own project! One last question though, I notice the |
That’s good to hear. Is your project something you want to make public?
This is for completeness. I used this to decrypt my own frc cookies, which are set by Amazon. I’m interested in reverse-engineering software. But for your use case, you don’t need these functions. |
So my project is a self hosted web app with the main goal of downloading books from OverDrive public libraries (with chapters), scraping metadata from Audible, scraping covers from Audible/iTunes, and then merging them into .m4b files using The other big part of it is an editor I made for customizing the metadata and adjusting the positions of chapters/adding and removing chapters before merging. This part ended up working so well that I ended up adding the functionality to import local book files to merge/fix the metadata on them as well. You can check out some example images of the project if you're curious. The main reason I'm hesitant about completely open sourcing it is for fear that potential employers might not look so kindly on the "piratey" nature of the app and it could hurt my chances of getting a job in the future, however unlikely that is. Which is a shame because it's the most work I've ever put into a side project haha. However I have no problem sharing it with people who are interested, and just the other day ended up sharing it with a few people on Reddit who wanted to try it out. I'll invite you with read access to the repo in case you want to try it out or check out the source code. @djdembeck I'll add you too in case you're curious, it seems up your alley. It might even give you some ideas for bragibooks!
Great, just wanted to make sure! And I can see the appeal, I personally get a certain satisfaction from translating one language into another. |
@csandman |
@csandman this is quite impressive. I wasn't expecting the level of polish you've put into your project. I wouldn't make it public, personally. Less for employers (who usually just care about code readability/competence), and more for the hot water of removing DRM. That is my opinion, however, you've clearly put a lot of work into this, and it would be a shame for it to never see the light of day. |
I appreciate it!
Yeah that's definitely my other concern. For the most part, providing source code for tools that accomplish the removal of DRM isn't usually attacked that often (from what I've seen), but it's still a concern nonetheless. I appreciate the feedback though! It has been a long time in the making. It started out as a command line app which just downloaded an entire OverDrive library, scraped metadata for each book, and merged the files all in one go. But I really wanted to be able to run it on my Unraid server, and I wanted to be able to tweak the metadata before merging, so I figured a webapp would be the best way to go. Conveniently, next.js has API routes built in, so moving from a CLI app to a web app wasn't that hard. The main thing that gave me problems was getting the Docker image working. I have very little experience with Docker besides this project and it has been a steep learning curve for me haha. And maybe I will find a way to share it some day. I do think a lot of people would get some good use out of it. You're welcome to use it if you want, or any of the source code if any of it would be helpful in any way. I didn't bother to add a license because I didn't plan on OS'ing it, but I'll probably add an MIT license now that some people have access to it. |
By the way @djdembeck, one more thing I realized about this is that in many cases, you will get the correct book from Goodreads back if you use the ASIN as the search term. From what I've seen, you will either get the correct Audible book, or no results if you use the ASIN, which would always be more reliably correct than matching by the title/author. I'll need to do some tests to see how often the ASIN matches to see whether or not this is a reliable method, but it's at least a good first step because you'd know the result is correct. In my app, I've also started searching both Audible and Goodreads using the ISBN I get back from OverDrive. This seems to give me matches even more reliably, but unfortunately I don't think there's a way to get the ISBN back from Audible (as far as I know). If you end up figuring out how to get that though, it would probably work even better. |
Let me know your findings about the reliability of searching by ASIN! I've been gearing up Audnexus to make it far easier to add provider 'plugins'. It's likely I'll have the Goodreads plugin require an explicit ID, and make clients (Plex, AudiobookShelf, etc) do the actual searching. This is how Audible is currently set up, so I think it's logical to stick to that design. |
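To make the "explicit ID, clients do the searching" design concrete, here is a rough sketch of what a provider contract could look like. Everything in it (`MetadataProvider`, `ProviderRegistry`, `ProviderBook`, the stub provider and its data) is hypothetical and is not how Audnexus is actually structured; it also assumes Goodreads IDs are purely numeric.

```typescript
// Hypothetical shapes for illustration; not the real Audnexus types.
interface ProviderBook {
  id: string;
  title: string;
  originalPublishDate?: string;
}

// A provider only resolves explicit IDs; it never searches.
interface MetadataProvider {
  readonly name: string;
  isValidId(id: string): boolean;
  fetchById(id: string): Promise<ProviderBook | null>;
}

class ProviderRegistry {
  private providers = new Map<string, MetadataProvider>();

  register(provider: MetadataProvider): void {
    this.providers.set(provider.name, provider);
  }

  // Reject unknown providers and malformed IDs before any lookup runs.
  async lookup(providerName: string, id: string): Promise<ProviderBook | null> {
    const provider = this.providers.get(providerName);
    if (!provider) {
      throw new Error(`Unknown provider: ${providerName}`);
    }
    if (!provider.isValidId(id)) {
      throw new Error(`Invalid ${providerName} ID: ${id}`);
    }
    return provider.fetchById(id);
  }
}

// In-memory stand-in for a Goodreads provider, with made-up data.
const stubGoodreads: MetadataProvider = {
  name: "goodreads",
  isValidId: (id) => /^\d+$/.test(id), // assumption: IDs are numeric
  fetchById: async (id) =>
    id === "12345"
      ? { id, title: "Example Book", originalPublishDate: "2001-01-01" }
      : null,
};

const registry = new ProviderRegistry();
registry.register(stubGoodreads);
```

With this split, a client such as a Plex agent would run its own search, pick a result, and only then call `registry.lookup("goodreads", id)`.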
@djdembeck After some more testing (not a huge sample size, maybe 20 books) I've found that I had around a 50% success rate or a little lower using the ASIN. So it's definitely not something I'd say you would be able to consistently rely on, but it is still something worth considering as a first approach IMO, as the result returned should be 100% accurate vs a manual comparison. |
By the way, I figured I'd share my complete implementation of the Goodreads API I'm using in my project: https://github.com/book-tools/audiobook-scraper-web/blob/main/src/backend/api/goodreads/goodreads-api.ts It's the project I mentioned previously. It's still private, but you should have access. Not sure if you've done any work on this yourself, but you could always use some stuff from mine if you haven't. You're welcome to fully copy/modify it if you'd like, take some parts of it, or completely ignore it! Up to you. I only implemented three Goodreads endpoints, but I'm not really sure if you'd need more than that. One thing to note is that in my last commit, I switched to using
Sorry for getting off-topic, but I think this would be a great tool for Audnexus, especially considering how many guards it seems you have in place to make sure your data is correct. |
Your project has grown quite a bit. Congrats on the work! I might need to steal you some for AudiobookDB 😛
|
Thanks! You're more than welcome to use whatever you like, I didn't make that project with a monetary goal in mind, so I'm happy to share if it helps make other apps better.
I discovered it a few months ago and to me, it feels like the future of runtime type checking in situations where types can't be inferred. The same guy who made it also made the initial version of The great part about And one of the coolest parts is that you can get the typing of the output object without having to redefine it as a standalone type.
import { z } from "zod";

const userSchema = z.object({
  username: z.string(),
});

userSchema.parse({ username: "Ludwig" });

// extract the inferred type
type User = z.infer<typeof userSchema>;
// { username: string } |
I want to revisit this. @csandman are you using GR lookup (via search?) or explicit ID lookup? I'm thinking the latter would be safe to add to Audnexus, but I don't love the lack of automation. |
By ID lookup, do you mean ASIN lookup like I mentioned before? Because in my app I am doing that, but only as a second approach because it's frequently not attached to books on Goodreads. My first approach is to match based on the ISBN of the book, because searching that on Goodreads tends to give results at a much higher rate. And like I mentioned before, the ISBN is available on all books on OverDrive, where I'm sourcing my books from originally. And finally, I just do some fuzzy matching on the title and authors of the book after searching for the book's title. Alternatively, if you mean lookup by Goodreads ID, I'm not sure what you mean. I wouldn't have access to that before running this function personally. Here is my code for matching if you're curious:
export const stripDiacretics = (str: string): string =>
  str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");

export const removeNonWordChars = (str: string): string =>
  str.replace(/\W+/g, " ");

export const removeSpaces = (str: string): string => str.replace(/\s/g, "");

export const removeExtraSpaces = (str: string): string =>
  str.replace(/\s{2,}/g, " ").trim();

export const simplify = (str: string): string =>
  removeSpaces(removeNonWordChars(stripDiacretics(str).toLowerCase()));

export const fuzzyMatch = (
  str1: string,
  str2: string,
  checkIncludes = false
) => {
  const simpleStr1 = simplify(str1);
  const simpleStr2 = simplify(str2);
  return (
    simpleStr1 === simpleStr2 ||
    (checkIncludes &&
      (simpleStr1.includes(simpleStr2) || simpleStr2.includes(simpleStr1)))
  );
};

export const checkAuthorOverlap = (authors1: Author[], authors2: Author[]) => {
  for (let i = 0; i < authors1.length; i += 1) {
    for (let j = 0; j < authors2.length; j += 1) {
      if (fuzzyMatch(authors1[i].name, authors2[j].name)) {
        return true;
      }
    }
  }
  return false;
};

/**
 * Find a goodreads book based on an input book
 *
 * @param book - A book to find a match for
 * @returns A Goodreads book or null if no match was found
 */
async function getGoodreadsMatch(book: Book): Promise<GoodreadsBook | null> {
  const settings = await loadSettings();
  const goodreadsUser = settings.goodreadsUser || process.env.GOODREADS_USER;
  const goodreadsPass = settings.goodreadsPass || process.env.GOODREADS_PASS;
  const goodreads = new GoodreadsApi(goodreadsUser, goodreadsPass);
  await goodreads.init();

  // First try to find a match based on the ISBN if available
  // this tends to have the best results
  if (book.isbn) {
    const goodreadsItemsFromIsbn = await goodreads.searchBooks(book.isbn);
    if (goodreadsItemsFromIsbn.length) {
      return goodreadsItemsFromIsbn[0];
    }
  }

  // If no match was found, try to find a match based on the ASIN
  // this is less likely to return any results but can still be useful
  if (book.asin) {
    const goodreadsItemsFromAsin = await goodreads.searchBooks(book.asin);
    if (goodreadsItemsFromAsin.length) {
      return goodreadsItemsFromAsin[0];
    }
  }

  // If no match was found, try to find a match based on the title
  // the author is still being compared so it shouldn't provide incorrect results
  const goodreadsItems = await goodreads.searchBooks(book.title, {
    searchField: "title",
  });
  for (let i = 0; i < goodreadsItems.length; i += 1) {
    const item = goodreadsItems[i];
    if (
      checkAuthorOverlap(item.authors, book.authors) &&
      fuzzyMatch(book.title, item.title, true)
    ) {
      return item;
    }
  }
  return null;
}
I know manual matching on the title and author(s) probably isn't the most appealing prospect, but it is highly successful. And if you were to implement a similar scoring function, like you have in the Audnexus Plex agent, you could get even better results. One alternative approach I thought of was trying to get a matching ISBN for your Audible books, and using that to search Goodreads, as I believe all Goodreads books have an ISBN attached to them. I got some inspiration from this issue in the Unfortunately, the way the plugin mentioned in that issue works is by searching for the ISBN of a book based on its title/author. Again, not a bulletproof solution. Additionally, I don't believe that package finds the ISBN of the audiobook version of a book, probably just the original physical book's. However, my idea is that if you could somehow find the ISBN of the audiobook version (the same edition that's on Audible), you could run a lookup on that ISBN on Audible to confirm you have the right book. e.g. https://www.audible.com/search?keywords=9780525633723 Because Audible allows you to search by ISBN, you would only ever end up with one result when running that search. So this is a little convoluted (and would involve a few steps) but here is the process as I envisioned it:
This is super convoluted, I know, I'm mostly just brainstorming at this point to work towards a fail-proof matching solution. It's pretty frustrating that you can't get the book's ISBN directly from the Audible API, considering they obviously have that information available if you're allowed to search by it. And realistically, having the ISBN would probably be one of the most useful things you could offer from this package. The main approach I've thought of for actually finding the audiobook ISBN for a book is using OverDrive. There are two different ways to search OverDrive.
In either case, you can get the ISBN by running the script I mentioned above, and then once you have the info access it like this:
mediaItems[bookId].formats[0].identifiers
and then you look for the identifier with Accessing the media item on the root OverDrive site is a bit different as well. There the info is added to that page at:
dataLayer[0].content.formats[0].identifiers
And the media item could be pulled from the page with a regex that's something like this:
// Fetch the book page and read its HTML as text
const bookPage = await fetch(bookUrl);
const bookPageText = await bookPage.text();
// Capture the array literal assigned to `dataLayer` in the page source
const dataLayerMatch = /dataLayer ?=(.*?]);/.exec(bookPageText);
if (!dataLayerMatch) {
  throw new Error("dataLayer not found on the book page");
}
const dataLayer = JSON.parse(dataLayerMatch[1].trim());
const mediaItem = dataLayer[0].content;
Sorry for the information dump, I just figured I'd provide you with everything I think might help you in this process that I've learned over time building my own app. A lot of this parsing is already coded by me in my own app, so again, if you want to use anything from it feel free! |
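The scoring idea raised in the comment above (ranking title/author candidates instead of accepting the first fuzzy match) could be sketched roughly as follows. This is not the scoring used by the Audnexus Plex agent; the `Candidate` shape, the 0.6/0.4 weights, and the 0.85 threshold are all assumptions picked for illustration, and the similarity measure is plain normalized Levenshtein distance.

```typescript
// Hypothetical shape for illustration; not the Book/GoodreadsBook types from the app.
interface Candidate {
  title: string;
  authors: string[];
}

// Lowercase, strip diacritics, and drop everything except letters and digits.
const normalize = (str: string): string =>
  str
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "");

// Levenshtein edit distance, keeping only two rows of the DP table.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    const curr = [i];
    for (let j = 1; j <= b.length; j += 1) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

// Weighted score: title similarity plus the best author-pair similarity.
// The weights are arbitrary assumptions, not tuned values.
function scoreCandidate(book: Candidate, candidate: Candidate): number {
  const titleScore = similarity(book.title, candidate.title);
  let authorScore = 0;
  for (const a of book.authors) {
    for (const b of candidate.authors) {
      authorScore = Math.max(authorScore, similarity(a, b));
    }
  }
  return 0.6 * titleScore + 0.4 * authorScore;
}

// Return the highest-scoring candidate at or above the threshold, or null.
function bestMatch(
  book: Candidate,
  candidates: Candidate[],
  threshold = 0.85
): Candidate | null {
  let best: Candidate | null = null;
  let bestScore = threshold;
  for (const candidate of candidates) {
    const score = scoreCandidate(book, candidate);
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Compared with an exact-or-includes check, a score tolerates small spelling differences and lets the caller pick the best of several results rather than the first one that passes.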
Closing this because AudiobookDB is opening soon-ish and has this capability. |
Is AudiobookDB open source? Or will it be? You've mentioned it a couple times but I'm not exactly sure what it is yet haha. It sounds very up my alley though, and I'm very curious to check it out! And could also be interested in helping with it! |
It's going to start closed source. The scale I would like to bring it up to isn't achievable without significant backing. I meant to release it last year, but had a personal loss that killed my ambition to work on it. Thankfully doing better now and the ideas are flowing again! Hoping to have something in testers hands in a month or so. |
By backing do you mean like hosting resources?
Ah yeah, you've mentioned that before, I'm sorry to hear it. Glad to hear you're doing better now though! Well either way, excited to check it out! I can guess the general purpose for it, and it's something I've thought about doing myself, so I'm very curious how it will turn out. |
Hey! I've been watching this thread for a while for all the helpful Goodreads reverse engineering, and am also interested in this AudiobookDB idea! I've contributed a bit to the audiobookshelf project, and its developer @advplyr had thought about starting a similar audiobook database. He owns bookdb.org and started the bookdb repo. I don't know what your vision for AudiobookDB is, but the audiobookshelf community is very interested in a crowdsourced database for everything audiobooks, so even if you wouldn't want to collaborate on its development I'm guessing integration with audiobookshelf would be mutually beneficial. Food for thought! |
Hey, thanks for reaching out! I'll definitely be making announcements to various communities when the beta opens. As it stands, I know audiobookshelf is using our other project, Audnexus. To divulge a bit about my progress on AbDB: I'm currently working on the CRUD frontend pages for the main data types. These are mostly done, but I'm unsatisfied with some of the designs so I've been refining them. Once those are in place, I'll reach out to some interested people to provide feedback on the MVP, so we can make things like schema and major design changes before launching the beta to everyone. Additionally, we are having our lawyer work on what we are allowed to host from major companies, write our ToS, etc. This doesn't have much time impact until wide release. The MVP won't have edit history, which has been a technical hurdle that I don't want to block getting feedback. The API is and has been ready to go. At the MVP stage, I'm going to fork the Audnexus agent into an AbDB agent for Plex to get some real world usage (and enjoy the fruit of my labor). I will make a public issues tracker and maybe a Discord for communication. API docs are already written and will be released when the MVP is set up on a dev server. Thanks for all the interest y'all! |
Coming here to follow up. I made an issue with details on how to test the alpha of AudiobookDB and help shape it a bit before wider usage: #689 |
One thing which I have not been able to find a consistent source for is an original publish date for audiobooks. I'm not talking about the date an audiobook was released, I'm talking specifically about the date the first edition of a book was released. In terms of organizing an Audiobook collection (in my case, in Plex), it is generally far more convenient to allow things to be sorted by when the books were published. This is especially true for books in a series which may have multiple releases for their audiobook versions.
I know GoodReads has this info, but I also know they closed their API access. However, it's surely still possible to scrape the info from their website with something like cheerio. I know Readarr has GoodReads scraping integration but I haven't had a chance to look through their code for how they do it yet.
There is also the Google Books API, specifically the volume search, which is a convenient way to search the Google Books library with a JSON response, but they don't include the original publish date in that response. Which is weird because they do offer that info on their book pages. It could be realistic to use their API to find a book match and then use the book's ID to scrape the Google Books page for that book.
Anyway, I'm not sure if attempting to scrape more sources like this is in the scope for this project, I've just been thinking that the original publish date for a book is one of the only things I haven't been able to get from Audible that is actually useful to have. I'm curious if you have any thoughts on the topic!
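As a rough sketch of the first half of the Google Books idea (find a matching volume, then keep its ID for a later page lookup), something like the following could work. The endpoint and the `intitle:`/`inauthor:` query keywords are part of the public Google Books volume search; the helper names and the trimmed-down response types are my own, and the page scraping step is deliberately left out.

```typescript
// Trimmed-down view of a volume search response; only the fields used here.
interface Volume {
  id: string;
  volumeInfo: {
    title?: string;
    authors?: string[];
    publishedDate?: string; // edition date, not the original publish date
  };
}

interface VolumeList {
  totalItems?: number;
  items?: Volume[];
}

// Hypothetical helper: build a volume search URL from a title and author.
function buildVolumeSearchUrl(title: string, author: string): string {
  const query = `intitle:${title} inauthor:${author}`;
  return `https://www.googleapis.com/books/v1/volumes?q=${encodeURIComponent(query)}`;
}

// Hypothetical helper: take the ID of the first result, or null if there is none.
function firstVolumeId(list: VolumeList): string | null {
  return list.items?.[0]?.id ?? null;
}

console.log(buildVolumeSearchUrl("Dune", "Frank Herbert"));
```

The JSON fetched from that URL would be passed to `firstVolumeId`, and the returned ID is what a scraper would then use to locate the book's page.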