-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#230 : Unsubscribe automation #236
Conversation
@madhav-relish is attempting to deploy a commit to the Inbox Zero Team on Vercel. A member of the Team first needs to authorize it. |
WalkthroughThe changes introduce a new Changes
Assessment against linked issues
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (4)
unsubscribe-automation/package.json (3)
1-5
: Improve project metadata for better documentation.Consider making the following improvements to the project metadata:
- Update the
main
field to point to the correct entry point (e.g., "dist/server.js" after compilation).- Add a meaningful
description
to provide context about the project.- Include relevant
keywords
to improve discoverability.- Specify the
author
information.Example:
{ "name": "unsubscribe-automation", "version": "1.0.0", "description": "A Playwright-based automation tool for email unsubscription processes", "main": "dist/server.js", "keywords": ["email", "unsubscribe", "automation", "playwright"], "author": "Your Name <your.email@example.com>", "license": "ISC" }Also applies to: 11-13
6-10
: Enhance scripts section with additional development tools.The current scripts are a good start, but consider adding the following to improve the development workflow:
- Implement the
test
script with an appropriate testing framework.- Add a
lint
script for code linting (e.g., ESLint).- Add a
format
script for code formatting (e.g., Prettier).- Add a
dev
script for running the server in development mode with auto-restart.Example:
"scripts": { "start": "ts-node -r tsconfig-paths/register src/server.ts", "build": "tsc", "test": "jest", "lint": "eslint . --ext .ts", "format": "prettier --write \"src/**/*.ts\"", "dev": "nodemon --exec ts-node -r tsconfig-paths/register src/server.ts" }Don't forget to install the necessary dev dependencies for these scripts.
14-21
: Enhance development dependencies for improved code quality and testing.Consider making the following improvements to the development dependencies:
- Add ESLint for code linting:
"eslint": "^8.0.0"
,"@typescript-eslint/eslint-plugin": "^5.0.0"
,"@typescript-eslint/parser": "^5.0.0"
- Add Prettier for code formatting:
"prettier": "^2.0.0"
- Add Jest for testing:
"jest": "^29.0.0"
,"@types/jest": "^29.0.0"
,"ts-jest": "^29.0.0"
- Use a caret range for @types/node to allow minor and patch updates:
"@types/node": "^22.5.4"
Example:
"devDependencies": { "@types/dotenv": "^8.2.0", "@types/jest": "^29.0.0", "@types/node": "^22.5.4", "@typescript-eslint/eslint-plugin": "^5.0.0", "@typescript-eslint/parser": "^5.0.0", "eslint": "^8.0.0", "jest": "^29.0.0", "playwright": "^1.47.2", "prettier": "^2.0.0", "ts-jest": "^29.0.0", "ts-node": "^10.9.2", "tsconfig-paths": "^4.2.0", "typescript": "^5.6.2" }unsubscribe-automation/src/server.ts (1)
47-47
: Useserver.log.info
instead ofconsole.log
for logging.For consistency and to leverage Fastify's logging system, replace
console.log
withserver.log.info
.Apply this diff to update the logging method:
- console.log(`Server is running on http://localhost:${port}`); + server.log.info(`Server is running on http://localhost:${port}`);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- unsubscribe-automation/.env.example (1 hunks)
- unsubscribe-automation/package.json (1 hunks)
- unsubscribe-automation/src/main.ts (1 hunks)
- unsubscribe-automation/src/server.ts (1 hunks)
- unsubscribe-automation/tsconfig.json (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- unsubscribe-automation/.env.example
- unsubscribe-automation/tsconfig.json
🧰 Additional context used
🪛 Biome
unsubscribe-automation/src/main.ts
[error] 148-155: This else clause can be omitted because previous branches break early.
Unsafe fix: Omit the else clause.
(lint/style/noUselessElse)
[error] 1-1: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the transpilers and avoids loading unnecessary modules.
Safe fix: Use import type.(lint/style/useImportType)
unsubscribe-automation/src/server.ts
[error] 45-45: Use Number.parseInt instead of the equivalent global.
ES2015 moved some globals into the Number namespace for consistency.
Safe fix: Use Number.parseInt instead.(lint/style/useNumberNamespace)
🔇 Additional comments (2)
unsubscribe-automation/package.json (1)
22-29
: Clarify and adjust runtime dependencies.The current dependencies look good, but consider the following adjustments:
Move
playwright
from devDependencies to dependencies, as it's likely needed for runtime automation:"playwright": "^1.47.2"Clarify the purpose of the
ai
package. If it's not necessary, consider removing it.Consider adding
@playwright/test
as a dev dependency if you plan to use Playwright's built-in test runner:"devDependencies": { "@playwright/test": "^1.47.2" }Ensure that the version of
@fastify/cors
(^10.0.1) is compatible with the version offastify
(^5.0.0) you're using. You might want to use a version that matches your Fastify major version.To verify the compatibility between Fastify and @fastify/cors, you can run:
This will show you the latest versions and their peer dependencies, helping ensure compatibility.
unsubscribe-automation/src/main.ts (1)
25-25
: Ensure the AI model 'gemini-1.5-flash' is available or handle model selectionThe model
"gemini-1.5-flash"
might not be available or may have usage restrictions. This could cause errors when generating content. Consider verifying the model's availability or providing a fallback.Run the following script to check available models:
If the model is unavailable, consider updating the code:
-const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" }); +const model = genAI.getGenerativeModel({ model: "default-model-name" });
unsubscribe-automation/src/main.ts
Outdated
@@ -0,0 +1,199 @@ | |||
import { chromium, Page } from "playwright"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use import type
for types-only imports to optimize module loading
The Page
import is only used as a type in your code. By using import type
, you ensure that the type is removed during transpilation, which can improve performance and avoid loading unnecessary modules.
Apply this diff to adjust the import statement:
-import { chromium, Page } from "playwright";
+import { chromium } from "playwright";
+import type { Page } from "playwright";
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import { chromium, Page } from "playwright"; | |
import { chromium } from "playwright"; | |
import type { Page } from "playwright"; |
🧰 Tools
🪛 Biome
[error] 1-1: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the transpilers and avoids loading unnecessary modules.
Safe fix: Use import type.(lint/style/useImportType)
unsubscribe-automation/src/main.ts
Outdated
|
||
dotenv.config(); | ||
|
||
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY || ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use vercel ai
as this is what we use in the core app too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added google ai using vercel ai sdk
unsubscribe-automation/README.md
Outdated
@@ -0,0 +1,101 @@ | |||
# Unsubscribe Automation Service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to apps
folder. web
is one app. unsubscribe-automation
is another app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved
unsubscribe-automation/README.md
Outdated
## Contributing | ||
|
||
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests. | ||
|
||
## License | ||
|
||
This project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this. The project already has these that cover everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
unsubscribe-automation/src/server.ts
Outdated
methods: ["GET", "POST"], | ||
}); | ||
|
||
const UnsubscribeSchema = z.object({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start schemas with a lowercase letter. ie UnsubscribeSchema
.
This is minor, but just keeps it consistent with the rest of the project. (I've seen other projects use the style you have here, but we don't)
Also zod docs, start with lowercase: https://zod.dev/?id=basic-usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
unsubscribe-automation/src/server.ts
Outdated
|
||
// Register CORS | ||
server.register(cors, { | ||
origin: process.env.CORS_ORIGIN || "http://localhost:3000", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can remove: || "http://localhost:3000",
If you want to set it as an env var to be localhost:3000 you can.
In production this may not be set at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
unsubscribe-automation/src/main.ts
Outdated
for (const selector of unsubscribeSelectors) { | ||
const element = page.locator(selector); | ||
if ((await element.count()) > 0) { | ||
await element.click({ timeout: 5000 }).catch(console.warn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer wrapping in try/catch for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added try catch block
unsubscribe-automation/src/main.ts
Outdated
const unsubscribeSelectors = [ | ||
"button:has-text(/unsubscribe/i)", | ||
"a:has-text(/unsubscribe/i)", | ||
'input[type="submit"][value*="unsubscribe" i]', | ||
':text("unsubscribe")', | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially more robust solution provided by Claude:
import { Page, ElementHandle } from 'playwright';
async function performFallbackUnsubscribe(page: Page): Promise<boolean> {
const unsubscribeKeywords = [
'unsubscribe', // English
'désabonner', // French
'abbestellen', // German
'cancelar suscripción', // Spanish
'annulla iscrizione', // Italian
'退订', // Chinese (Simplified)
'退訂', // Chinese (Traditional)
'退会', // Japanese
'отписаться', // Russian
'se désabonner', // Alternative French
'désinscription', // Another French alternative
'abmelden', // Alternative German
'darse de baja', // Alternative Spanish
];
const generateSelectors = (keyword: string) => [
`button:has-text("${keyword}")`,
`a:has-text("${keyword}")`,
`input[type="submit"][value*="${keyword}" i]`,
`:text("${keyword}")`,
`[aria-label*="${keyword}" i]`,
`[title*="${keyword}" i]`,
];
const allSelectors = unsubscribeKeywords.flatMap(generateSelectors);
for (const selector of allSelectors) {
try {
const elements = await page.$$(selector);
if (elements.length > 0) {
// Filter visible elements
const visibleElements = await Promise.all(
elements.map(async (el) => (await el.isVisible()) ? el : null)
);
const element = visibleElements.find((el) => el !== null) as ElementHandle<Element> | undefined;
if (element) {
await element.click({ timeout: 5000 });
console.log(`Successfully clicked unsubscribe element: ${selector}`);
return true;
}
}
} catch (error) {
console.warn(`Error trying to click ${selector}:`, error);
}
}
console.log('No unsubscribe element found or clicked in fallback strategy');
return false;
}
And it's explanation in case you disagree with its reasoning:
Here are the key improvements made to the function:
Return value: The function now returns a boolean indicating success or failure, which can be useful for error handling or flow control in the calling code.
Error handling: Each selector attempt is now wrapped in a try-catch block, allowing the function to continue trying other selectors if one fails.
Visibility check: The function now checks for element visibility before attempting to click, which can help avoid clicking on hidden elements.
Use of $$ instead of locator: This change allows for more precise element selection and easier handling of multiple matching elements.
Timeout handling: The click timeout is still set to 5000ms, but now it's handled more gracefully within the try-catch block.
TypeScript types: Added proper type annotations for better type safety and code clarity.
Flexibility: The selectors array is now easier to extend with additional selectors if needed.
These changes should make the function more robust and reliable for finding and clicking unsubscribe elements across various webpage structures. The function will try all selectors, handle errors gracefully, and only attempt to click visible elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, need to add support for more complex unsubscribe pages like
select reasons for unsubscribing
unsubscribe-automation/src/main.ts
Outdated
|
||
export async function autoUnsubscribe(url: string): Promise<boolean> { | ||
// Remove headless: false if you don't want the browser popup to open | ||
const browser = await chromium.launch({ headless: false }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can set this as an env var. In prod we will likely use headless: true. But it is easier in dev to see it headful mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an env variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked half of it and left comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (9)
apps/unsubscribe-automation/.env.example (4)
1-1
: Add a comment explaining the GOOGLE_GENERATIVE_AI_API_KEY.While it's good practice to leave the API key value empty in the .env.example file, consider adding a comment explaining the purpose of this key and how developers can obtain it. This will help new contributors set up the project more easily.
Consider adding a comment like this:
+# Google Generative AI API key for LLM integration +# Obtain this key from the Google Cloud Console: https://console.cloud.google.com/ GOOGLE_GENERATIVE_AI_API_KEY=
2-3
: LGTM! Consider adding a note about production settings.The CORS_ORIGIN variable is well-documented and appropriately set for development.
Consider expanding the comment to remind developers about production settings:
-# If required, set CORS_ORIGIN to allow requests from your frontend +# If required, set CORS_ORIGIN to allow requests from your frontend +# Note: Update this value for production environments CORS_ORIGIN="http://localhost:3000"
4-4
: Add a comment explaining NODE_ENV and its possible values.While setting NODE_ENV to "development" is appropriate for a .env.example file, it would be helpful to add a comment explaining the significance of this variable and mentioning other possible values.
Consider adding a comment like this:
+# Node environment: development, test, or production +# This affects various behaviors in Node.js applications NODE_ENV="development"
1-4
: Consider adding a header comment to the file.To improve the overall documentation of the project, consider adding a header comment to this .env.example file. This comment could explain the purpose of the file and provide instructions for developers on how to use it.
Here's a suggested header comment:
+# This is an example environment file for the unsubscribe-automation service. +# Copy this file to .env and fill in the appropriate values for your environment. +# Do not commit the actual .env file to version control. GOOGLE_GENERATIVE_AI_API_KEY= # If required, set CORS_ORIGIN to allow requests from your frontend CORS_ORIGIN="http://localhost:3000" NODE_ENV="development"apps/unsubscribe-automation/package.json (3)
1-5
: Enhance project metadata for better documentation.Consider making the following improvements to the project metadata:
- Add a concise description of the project in the "description" field.
- Update the "main" field to "dist/server.js" to match the TypeScript output.
- Add relevant keywords to improve discoverability.
- Specify the author or organization responsible for the project.
These changes will provide better context and improve the project's documentation.
Also applies to: 11-13
6-10
: Enhance scripts for development and production use.The current scripts are a good start, but consider the following improvements:
- Add a production start script that runs the compiled JavaScript:
"start:prod": "node dist/server.js"
- Rename the current start script to emphasize its use in development:
"start:dev": "ts-node -r tsconfig-paths/register src/server.ts"
- Update the test script to use a testing framework like Jest:
"test": "jest"
- Add a lint script for code quality:
"lint": "eslint . --ext .ts"
These changes will provide a more comprehensive set of scripts for development, testing, and production use.
14-21
: Enhance devDependencies for improved development experience.Consider the following suggestions for the devDependencies:
- Use a caret range for @types/node to allow minor version updates:
"@types/node": "^22.5.4"
- Add ESLint for code linting:
"eslint": "^8.0.0"
- Add Jest for testing:
"jest": "^29.0.0"
and"@types/jest": "^29.0.0"
- Consider adding
prettier
for code formattingThese additions will improve code quality, enable testing, and enhance the overall development experience.
apps/unsubscribe-automation/src/server.ts (1)
48-48
: Useserver.log.info
instead ofconsole.log
for consistent logging.For consistency and better integration with Fastify's logging system, replace
console.log
withserver.log.info
.Apply this diff to update the logging method:
await server.listen({ port, host: "0.0.0.0" }); - console.log(`Server is running on ${port}`); + server.log.info(`Server is running on ${port}`);apps/unsubscribe-automation/src/main.ts (1)
85-99
: Add a default case to handle unexpected action typesThe switch statement lacks a default case. Adding one can help catch and log any unexpected action types, improving robustness.
Here's the suggested addition:
switch (action.type) { case "click": case "submit": await locator.click({ timeout: 5000 }); break; case "fill": if (action.value) { await locator.fill(action.value, { timeout: 5000 }); } break; case "select": if (action.value) { await locator.selectOption(action.value, { timeout: 5000 }); } break; + default: + console.warn(`Unhandled action type: ${action.type}`); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (6)
- apps/unsubscribe-automation/.env.example (1 hunks)
- apps/unsubscribe-automation/README.md (1 hunks)
- apps/unsubscribe-automation/package.json (1 hunks)
- apps/unsubscribe-automation/src/main.ts (1 hunks)
- apps/unsubscribe-automation/src/server.ts (1 hunks)
- apps/unsubscribe-automation/tsconfig.json (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- apps/unsubscribe-automation/tsconfig.json
🧰 Additional context used
🪛 LanguageTool
apps/unsubscribe-automation/README.md
[uncategorized] ~74-~74: Loose punctuation mark.
Context: ...use OpenAI. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
🪛 Markdownlint
apps/unsubscribe-automation/README.md
55-55: null
Bare URL used(MD034, no-bare-urls)
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
45-45: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
52-52: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
81-81: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
86-86: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
🪛 Biome
apps/unsubscribe-automation/src/main.ts
[error] 222-233: This else clause can be omitted because previous branches break early.
Unsafe fix: Omit the else clause.
(lint/style/noUselessElse)
[error] 30-30: Reassigning a function parameter is confusing.
The parameter is declared here:
Use a local variable instead.
(lint/style/noParameterAssign)
apps/unsubscribe-automation/src/server.ts
[error] 46-46: Use Number.parseInt instead of the equivalent global.
ES2015 moved some globals into the Number namespace for consistency.
Safe fix: Use Number.parseInt instead.(lint/style/useNumberNamespace)
🔇 Additional comments (2)
apps/unsubscribe-automation/package.json (1)
22-31
: Review and optimize dependencies for consistency and stability.Consider the following suggestions for the dependencies:
- Evaluate the need for multiple AI-related packages (@ai-sdk/google, @ai-sdk/openai, @google/generative-ai, ai). Consider consolidating to reduce potential conflicts and simplify maintenance.
- Pin the fastify version to avoid potential breaking changes:
"fastify": "5.0.0"
- Ensure that the versions of @ai-sdk/google and @ai-sdk/openai are compatible with each other and with the @google/generative-ai package.
- Consider adding a package for structured logging, such as pino, which integrates well with Fastify.
These changes will help maintain consistency and stability in your project dependencies.
To verify the compatibility of AI-related packages, run the following script:
apps/unsubscribe-automation/src/main.ts (1)
178-179
: Verify browser headless mode configurationThe browser is launched in headful mode when not in production. If this is for debugging purposes, it's acceptable. Otherwise, consider running in headless mode consistently.
Is the headful mode intended for development debugging? If not, you might want to change it:
const isProduction = process.env.NODE_ENV === "production"; - const browser = await chromium.launch({ headless: isProduction }); + const browser = await chromium.launch({ headless: true });
|
||
const unsubscribeSchema = z.object({ | ||
url: z.string().url(), | ||
useGoogle: z.boolean().optional(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
useGoogle
parameter is defined but not utilized.
The useGoogle
parameter is included in the unsubscribeSchema
but isn't extracted or used in the handler. If this parameter is meaningful for the autoUnsubscribe
function, consider extracting it and passing it along.
Apply this diff to utilize the useGoogle
parameter:
const unsubscribeSchema = z.object({
url: z.string().url(),
useGoogle: z.boolean().optional(),
});
server.post("/unsubscribe", async (request, reply) => {
try {
- const { url } = unsubscribeSchema.parse(request.body);
+ const { url, useGoogle } = unsubscribeSchema.parse(request.body);
- const success = await autoUnsubscribe(url);
+ const success = await autoUnsubscribe(url, useGoogle);
return {
success,
Ensure that the autoUnsubscribe
function is updated to accept the useGoogle
parameter if it doesn't already.
Also applies to: 25-26
|
||
const start = async () => { | ||
try { | ||
const port = process.env.PORT ? parseInt(process.env.PORT) : 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Number.parseInt
instead of the global parseInt
.
ES2015 introduced Number.parseInt
for consistency and to avoid potential issues with the global function. It's recommended to use Number.parseInt
for clarity.
Apply this diff to fix the issue:
const start = async () => {
try {
const port = process.env.PORT ?
- parseInt(process.env.PORT) : 5000;
+ Number.parseInt(process.env.PORT) : 5000;
await server.listen({ port, host: "0.0.0.0" });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const port = process.env.PORT ? parseInt(process.env.PORT) : 5000; | |
const port = process.env.PORT ? Number.parseInt(process.env.PORT) : 5000; |
🧰 Tools
🪛 Biome
[error] 46-46: Use Number.parseInt instead of the equivalent global.
ES2015 moved some globals into the Number namespace for consistency.
Safe fix: Use Number.parseInt instead.(lint/style/useNumberNamespace)
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | ||
const maxContentLength = 20000; // Adjust based on API limitations | ||
if (pageContent.length > maxContentLength) { | ||
console.warn( | ||
`Page content exceeds ${maxContentLength} characters. Truncating...`, | ||
); | ||
pageContent = pageContent.substring(0, maxContentLength); | ||
} | ||
|
||
const prompt = ` | ||
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | ||
Provide a JSON object with: | ||
1. An 'actions' array containing steps to unsubscribe. Each action should have: | ||
- 'type': Either 'click', 'fill', 'select', or 'submit' | ||
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | ||
- 'value': (Optional) For input fields. Omit this field if not applicable. | ||
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | ||
|
||
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | ||
|
||
HTML Content: | ||
${pageContent} | ||
`; | ||
|
||
const { text: analysisText } = await generateText({ | ||
model: google("gemini-1.5-flash"), | ||
prompt: prompt, | ||
}); | ||
|
||
try { | ||
// Remove any markdown code block indicators | ||
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | ||
const parsedAnalysis = JSON.parse(cleanedText); | ||
return pageAnalysisSchema.parse(parsedAnalysis); | ||
} catch (error) { | ||
console.error("Error parsing AI response:", error); | ||
console.error("Raw AI response:", analysisText); | ||
throw new Error("Failed to parse AI response"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure compliance with data privacy when sending page content to AI services
The function analyzePageWithAI
sends page content to an external AI service. If the page contains sensitive information, this could raise privacy concerns. Please verify compliance with data privacy regulations and consider sanitizing the content before transmission.
Would you like assistance in implementing content sanitization or establishing guidelines for handling sensitive data?
🧰 Tools
🪛 Biome
[error] 30-30: Reassigning a function parameter is confusing.
The parameter is declared here:
Use a local variable instead.
(lint/style/noParameterAssign)
if (confirmationFound) { | ||
console.log("Unsubscribe confirmation found."); | ||
return true; | ||
} else { | ||
console.log("Unsubscribe action performed, but confirmation not found."); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ | ||
path: "final-state-screenshot.png", | ||
fullPage: true, | ||
}); | ||
} | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplify code by removing unnecessary 'else' clause
Since the if
block at line 218 returns a value, the subsequent else
clause is redundant and can be removed for cleaner code.
Here's the refactored code:
if (confirmationFound) {
console.log("Unsubscribe confirmation found.");
return true;
}
- else {
console.log("Unsubscribe action performed, but confirmation not found.");
// Only take screenshot if not in production
if (!isProduction) {
await page.screenshot({
path: "final-state-screenshot.png",
fullPage: true,
});
}
return false;
- }
+ console.log("Unsubscribe action performed, but confirmation not found.");
+ // Only take screenshot if not in production
+ if (!isProduction) {
+ await page.screenshot({
+ path: "final-state-screenshot.png",
+ fullPage: true,
+ });
+ }
+ return false;
By removing the else
clause, the code after the if
condition executes only when confirmationFound
is false
, improving readability.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (confirmationFound) { | |
console.log("Unsubscribe confirmation found."); | |
return true; | |
} else { | |
console.log("Unsubscribe action performed, but confirmation not found."); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ | |
path: "final-state-screenshot.png", | |
fullPage: true, | |
}); | |
} | |
return false; | |
} | |
if (confirmationFound) { | |
console.log("Unsubscribe confirmation found."); | |
return true; | |
} | |
console.log("Unsubscribe action performed, but confirmation not found."); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ | |
path: "final-state-screenshot.png", | |
fullPage: true, | |
}); | |
} | |
return false; |
const { text: analysisText } = await generateText({ | ||
model: google("gemini-1.5-flash"), | ||
prompt: prompt, | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for the AI service call
The call to generateText
may fail due to network issues or API errors. Consider adding a try-catch block to handle potential exceptions from the AI service.
Here's a suggested modification:
+ let analysisText: string;
+ try {
const { text } = await generateText({
model: google("gemini-1.5-flash"),
prompt: prompt,
});
+ analysisText = text;
+ } catch (error) {
+ console.error("Error generating AI analysis:", error);
+ throw new Error("Failed to generate AI analysis");
+ }
This ensures that any errors from the AI service are appropriately caught and handled.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const { text: analysisText } = await generateText({ | |
model: google("gemini-1.5-flash"), | |
prompt: prompt, | |
}); | |
let analysisText: string; | |
try { | |
const { text } = await generateText({ | |
model: google("gemini-1.5-flash"), | |
prompt: prompt, | |
}); | |
analysisText = text; | |
} catch (error) { | |
console.error("Error generating AI analysis:", error); | |
throw new Error("Failed to generate AI analysis"); | |
} |
if (pageContent.length > maxContentLength) { | ||
console.warn( | ||
`Page content exceeds ${maxContentLength} characters. Truncating...`, | ||
); | ||
pageContent = pageContent.substring(0, maxContentLength); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid reassigning function parameters; use a local variable instead
Reassigning the function parameter pageContent
can lead to confusion and unintended side effects. It's better practice to assign the modified value to a new local variable.
Here's a suggested fix:
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> {
const maxContentLength = 20000; // Adjust based on API limitations
+ let contentToAnalyze = pageContent;
- if (pageContent.length > maxContentLength) {
+ if (contentToAnalyze.length > maxContentLength) {
console.warn(
`Page content exceeds ${maxContentLength} characters. Truncating...`,
);
- pageContent = pageContent.substring(0, maxContentLength);
+ contentToAnalyze = contentToAnalyze.substring(0, maxContentLength);
}
Then, replace instances of pageContent
with contentToAnalyze
where appropriate in the rest of the function.
Committable suggestion was skipped due to low confidence.
🧰 Tools
🪛 Biome
[error] 30-30: Reassigning a function parameter is confusing.
The parameter is declared here:
Use a local variable instead.
(lint/style/noParameterAssign)
if (elements.length > 0) { | ||
// Filter visible elements | ||
const visibleElements = await Promise.all( | ||
elements.map(async (el) => ((await el.isVisible()) ? el : null)), | ||
); | ||
const element = visibleElements.find((el) => el !== null) as | ||
| ElementHandle<Element> | ||
| undefined; | ||
|
||
if (element) { | ||
await element.click({ timeout: 5000 }); | ||
console.log(`Successfully clicked unsubscribe element: ${selector}`); | ||
return true; | ||
} | ||
} | ||
} catch (error) { | ||
console.warn(`Error trying to click ${selector}:`, error); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize visibility checks in the fallback strategy
In performFallbackUnsubscribe
, using Promise.all
for visibility checks can be resource-intensive. Consider using Playwright's locator.filter()
method for more efficient element selection.
Refactored code:
for (const selector of allSelectors) {
try {
- const elements = await page.$$(selector);
- if (elements.length > 0) {
- // Filter visible elements
- const visibleElements = await Promise.all(
- elements.map(async (el) => ((await el.isVisible()) ? el : null)),
- );
- const element = visibleElements.find((el) => el !== null) as
- | ElementHandle<Element>
- | undefined;
+ const element = await page.locator(selector).filter({ hasText: /./ }).first();
if (element) {
await element.click({ timeout: 5000 });
console.log(`Successfully clicked unsubscribe element: ${selector}`);
return true;
}
} catch (error) {
console.warn(`Error trying to click ${selector}:`, error);
}
}
This approach leverages Playwright's built-in methods for a more efficient and readable solution.
Committable suggestion was skipped due to low confidence.
- Added a timeout if AI takes too long to respond - Added a maximum retry count for actions
From a quick overview it's looking good! I'm going to test later today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (6)
apps/unsubscribe-automation/README.md (6)
1-11
: Enhance the introduction with technology stack information.The introduction provides a clear overview of the service. Consider adding a brief mention of the key technologies used (e.g., Playwright, AI models) to give readers an immediate understanding of the technical foundation.
13-18
: Add resource links for API keys.To assist users in obtaining the required API keys, consider adding links to the relevant Google AI and OpenAI documentation or sign-up pages.
44-64
: Add information about the API response format.The usage instructions and curl example are helpful. Consider adding information about the expected response format from the API to give users a complete picture of the interaction.
🧰 Tools
🪛 Markdownlint
50-50: null
Bare URL used(MD034, no-bare-urls)
47-47: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
66-69
: Expand API endpoint documentation.While the endpoints are listed, consider adding more detailed documentation for each endpoint. Include information such as:
- Required headers
- Request body format (for POST requests)
- Possible response codes
- Response body format
This will provide developers with a more comprehensive understanding of how to interact with the API.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~68-~68: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
84-87
: Expand the troubleshooting section.The current troubleshooting tips are helpful. Consider expanding this section to cover more potential issues that users might encounter, such as:
- Common errors and their solutions
- Steps for debugging AI-related issues
- How to report bugs or request support
This will help users resolve problems more efficiently and improve their overall experience with the service.
24-24
: Address Markdown linting issues.To improve the formatting and readability of the README, please address the following:
Specify languages for all fenced code blocks. For example:
-``` +```bashReplace the bare URL on line 50 with a proper Markdown link:
-The server will start on http://localhost:5000 by default. +The server will start on [http://localhost:5000](http://localhost:5000) by default.These changes will enhance the document's consistency and render better in Markdown viewers.
Also applies to: 30-30, 37-37, 47-47, 75-75, 80-80, 50-50
🧰 Tools
🪛 Markdownlint
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- apps/unsubscribe-automation/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
apps/unsubscribe-automation/README.md
[uncategorized] ~68-~68: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
🪛 Markdownlint
apps/unsubscribe-automation/README.md
50-50: null
Bare URL used(MD034, no-bare-urls)
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
47-47: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
75-75: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
80-80: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
🧹 Outside diff range and nitpick comments (10)
apps/unsubscriber/.env.example (2)
2-3
: LGTM! Consider adding a production security reminder.The
CORS_ORIGIN
setting and its accompanying comment are clear and appropriate for local development.Consider adding a comment reminding developers to set this to a specific, secure origin in production environments. For example:
# If required, set CORS_ORIGIN to allow requests from your frontend CORS_ORIGIN="http://localhost:3000" +# Note: In production, set this to your frontend's specific origin for security
4-4
: LGTM! Consider adding an explanatory comment.The
NODE_ENV
setting is appropriate for a development environment.Consider adding a brief comment explaining the significance of this variable and reminding developers to change it for production. For example:
+# Set the Node.js environment (use "production" for production deployments) NODE_ENV="development"
apps/unsubscriber/README.md (6)
20-43
: Enhance installation instructionsThe installation steps are clear, but there are a few points to address:
- The .env file example doesn't include the OpenAI API key mentioned in the prerequisites. If OpenAI is being used, please add it to the example.
- Consider adding brief instructions or links on how to obtain the required API keys.
- The project directory in the cd command (
apps/unsubscribe-automation
) differs from the one in the PR objectives (apps/unsubscriber
). Please verify the correct directory name.Here's a suggested update for the .env file example:
GOOGLE_GENERATIVE_AI_API_KEY=your_google_ai_api_key_here +OPENAI_API_KEY=your_openai_api_key_here CORS_ORIGIN=http://localhost:3000
Also, consider adding:
For instructions on obtaining API keys, visit: - Google AI: [link to Google AI API key instructions] - OpenAI (if applicable): [link to OpenAI API key instructions]🧰 Tools
🪛 Markdownlint
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
44-65
: Enhance usage exampleThe instructions for running the service and using the API are clear. To further improve the usage section:
- Consider adding an example of the expected response format from the API.
- Provide a brief explanation of what happens after the request is sent (e.g., "The service will attempt to unsubscribe and return a status message").
- Add a note about error handling (e.g., "If the unsubscribe process fails, the API will return an error message with details").
Here's a suggested addition to the usage section:
Example response: ```json { "success": true, "message": "Successfully unsubscribed from the newsletter." }The service will attempt to unsubscribe using the provided URL and return a status message. If the process fails, you'll receive an error message with details about the failure.
<details> <summary>🧰 Tools</summary> <details> <summary>🪛 Markdownlint</summary><blockquote> 50-50: null Bare URL used (MD034, no-bare-urls) --- 47-47: null Fenced code blocks should have a language specified (MD040, fenced-code-language) </blockquote></details> </details> --- `66-70`: **Expand API endpoints documentation** While the API endpoints are listed, providing more detailed information would be beneficial for developers integrating with the service. Consider expanding this section with: 1. HTTP method for each endpoint (GET is specified for the health check, but not for the unsubscribe endpoint) 2. Expected request format for the unsubscribe endpoint 3. Possible response codes and their meanings 4. Any rate limiting or authentication requirements Here's a suggested expansion of the API endpoints section: ```markdown ## API Endpoints 1. Health Check - Method: GET - Endpoint: `/` - Description: Verifies if the service is running - Success Response: 200 OK 2. Unsubscribe - Method: POST - Endpoint: `/unsubscribe` - Description: Triggers the unsubscribe process for a given URL - Request Body: ```json { "url": "https://example.com/unsubscribe" } ``` - Success Response: 200 OK - Error Response: 400 Bad Request, 500 Internal Server Error Note: Ensure your requests include the appropriate `Content-Type: application/json` header.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~68-~68: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
71-83
: Enhance development sectionThe development instructions are clear, but could be expanded to provide more value to developers. Consider adding:
- A brief explanation of what "development mode with hot reloading" means
- Instructions for running tests (if applicable)
- Linting commands (if used in the project)
- Any other relevant development workflows or scripts
Here's a suggested expansion of the development section:
## Development - To run the service in development mode with hot reloading (automatically restarts on file changes):pnpm run dev
- To build the TypeScript files:
pnpm run build
- To run tests:
pnpm test
- To lint the codebase:
pnpm run lint
For more information on the available scripts, refer to the `scripts` section in `package.json`.
🧰 Tools
🪛 Markdownlint
75-75: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
80-80: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
84-87
: Expand troubleshooting sectionThe troubleshooting section provides good initial guidance. To make it more comprehensive, consider adding:
- Common error messages and their potential solutions
- Steps to verify API key validity
- Instructions for enabling verbose logging
- Information on where to find log files
- Links to relevant documentation or support channels
Here's a suggested expansion of the troubleshooting section:
## Troubleshooting - If you encounter issues with Playwright, ensure you have the necessary system dependencies installed. Refer to the [Playwright installation guide](https://playwright.dev/docs/intro#installation) for more information. - Check the console output for error messages or logs that might indicate the cause of any issues. - Common issues and solutions: 1. "API key invalid": Verify that your API keys in the .env file are correct and properly formatted. 2. "CORS error": Ensure the `CORS_ORIGIN` in your .env file matches your frontend application's URL. 3. "Timeout error": Check your internet connection and try increasing the timeout value in the configuration. - To enable verbose logging, set the `DEBUG` environment variable:DEBUG=* pnpm start
- Log files are located in the `logs` directory at the root of the project. For further assistance, please check our [FAQ](link-to-faq) or open an issue on our [GitHub repository](link-to-repo).
24-24
: Improve markdown formattingTo enhance the readability and consistency of the README, please address the following markdown formatting issues:
- Specify languages for all fenced code blocks. For example, change
to
bash for shell commands and ```env for environment variables.- Avoid using bare URLs. Instead, use the markdown link syntax: text.
Here's an example of how to update the fenced code blocks:
- ``` + ```bash cd apps/unsubscribe-automation ``` - ``` + ```bash pnpm install ``` - ``` + ```env GOOGLE_GENERATIVE_AI_API_KEY=your_google_ai_api_key_here CORS_ORIGIN=http://localhost:3000 ```And for the bare URL on line 50:
- The server will start on http://localhost:5000 by default. + The server will start on [http://localhost:5000](http://localhost:5000) by default.Also applies to: 30-30, 37-37, 47-47, 50-50, 75-75, 80-80
🧰 Tools
🪛 Markdownlint
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
apps/unsubscriber/src/server.ts (1)
49-49
: Useserver.log.info
instead ofconsole.log
for consistent loggingTo maintain consistent logging throughout your application, replace
console.log
withserver.log.info
.Apply this diff to make the change:
- console.log(`Server is running on ${port}`); + server.log.info(`Server is running on ${port}`);apps/unsubscriber/src/main.ts (1)
210-212
: Consider Headless Mode for ProductionLaunching the browser in non-headless mode in production may impact performance and security.
Ensure that
headless
is set appropriately for production environments.const isProduction = process.env.NODE_ENV === "production"; -const browser = await chromium.launch({ headless: isProduction }); +const browser = await chromium.launch({ headless: true });If you need visual feedback in non-production, you can set
headless: false
for development environments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
- apps/unsubscriber/.env.example (1 hunks)
- apps/unsubscriber/README.md (1 hunks)
- apps/unsubscriber/package.json (1 hunks)
- apps/unsubscriber/src/main.ts (1 hunks)
- apps/unsubscriber/src/server.ts (1 hunks)
- apps/unsubscriber/tsconfig.json (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- apps/unsubscriber/package.json
- apps/unsubscriber/tsconfig.json
🧰 Additional context used
🪛 LanguageTool
apps/unsubscriber/README.md
[uncategorized] ~68-~68: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
🪛 Markdownlint
apps/unsubscriber/README.md
50-50: null
Bare URL used(MD034, no-bare-urls)
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
47-47: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
75-75: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
80-80: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
🔇 Additional comments (5)
apps/unsubscriber/.env.example (1)
1-1
: LGTM! Ensure API key security.The empty declaration for
GOOGLE_GENERATIVE_AI_API_KEY
is appropriate for a.env.example
file.To verify that no actual API keys are committed, run the following script:
If the script returns any results, review those files to ensure no actual API keys are committed.
✅ Verification successful
Verification Successful: No API Key Leaks Found
The search confirmed that no
.env
files contain non-emptyGOOGLE_GENERATIVE_AI_API_KEY
values.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for potential API key leaks in .env files # Test: Search for .env files with non-empty GOOGLE_GENERATIVE_AI_API_KEY rg -g '.env*' 'GOOGLE_GENERATIVE_AI_API_KEY=.+' --no-filenameLength of output: 61
Script:
#!/bin/bash # Description: Verify no non-empty GOOGLE_GENERATIVE_AI_API_KEY is committed in .env files # Corrected search pattern with proper quoting rg -g '.env*' 'GOOGLE_GENERATIVE_AI_API_KEY=.+' --no-filenameLength of output: 61
Script:
#!/bin/bash # Description: Verify no non-empty GOOGLE_GENERATIVE_AI_API_KEY is committed in .env files using grep # Search for non-empty GOOGLE_GENERATIVE_AI_API_KEY assignments in all .env files grep -E 'GOOGLE_GENERATIVE_AI_API_KEY=.+' .env* --no-filenameLength of output: 102
Script:
#!/bin/bash # Description: Recursively search for non-empty GOOGLE_GENERATIVE_AI_API_KEY in all .env* files # Find all .env* files recursively and search for non-empty GOOGLE_GENERATIVE_AI_API_KEY fd -e env -t f . | xargs grep -H 'GOOGLE_GENERATIVE_AI_API_KEY=.+' || echo "No non-empty GOOGLE_GENERATIVE_AI_API_KEY found."Length of output: 173
apps/unsubscriber/README.md (2)
1-12
: LGTM: Clear and comprehensive introductionThe introduction and features section provide a concise yet comprehensive overview of the Unsubscribe Automation Service. The listed features align well with the PR objectives, covering key aspects such as AI-powered analysis, Playwright automation, API endpoints, and integration support.
13-19
: Clarify the use of OpenAIThe prerequisites section mentions an OpenAI API key, but the PR objectives only specify the use of Google's Generative AI. Could you please clarify if OpenAI is also being used in this service? If not, consider removing the OpenAI API key requirement to avoid confusion.
apps/unsubscriber/src/main.ts (2)
247-248
: Handle Errors When Waiting for Network Idle StateErrors thrown by
waitForLoadState
are currently caught but may mask underlying issues.Ensure that swallowing these errors won't hide important issues. If necessary, handle specific timeout errors and log appropriately.
await page .waitForLoadState("networkidle", { timeout: 10000 }) .catch((error) => { - console.warn( + console.error( "Error waiting for network idle state after actions:", error, ); });
164-170
: Verify the Use of Non-Standard SelectorsThe selector
:text("${keyword}")
may not be a standard or supported selector in Playwright.Confirm that all selectors in
generateSelectors
are valid in Playwright. Adjust if necessary to use supported selector engines.const generateSelectors = (keyword: string) => [ `button:has-text("${keyword}")`, `a:has-text("${keyword}")`, `input[type="submit"][value*="${keyword}" i]`, - `:text("${keyword}")`, + `text="${keyword}"`, `[aria-label*="${keyword}" i]`, `[title*="${keyword}" i]`, ];In Playwright,
text=
selector might be more appropriate for selecting elements by their text content.
GOOGLE_GENERATIVE_AI_API_KEY= | ||
# If required, set CORS_ORIGIN to allow requests from your frontend | ||
CORS_ORIGIN="http://localhost:3000" | ||
NODE_ENV="development" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Sensitive Data Detected in .env.example
- The following environment variables contain non-empty values:
DATABASE_URL="postgresql://postgres:password@localhost:5432/inboxzero?schema=public"
DIRECT_URL="postgresql://postgres:password@localhost:5432/inboxzero?schema=public"
NEXTAUTH_URL=http://localhost:3000
GOOGLE_PUBSUB_TOPIC_NAME="projects/abc/topics/xyz"
Please ensure that these values are placeholders and do not contain actual sensitive information. It's recommended to keep all sensitive data out of version control by leaving values empty or clearly indicating they are placeholders.
🔗 Analysis chain
Overall, the .env.example file looks good. Ensure proper security practices.
The file provides a good template for required environment variables. Remember to:
- Keep actual API keys and sensitive data out of version control.
- Update
CORS_ORIGIN
andNODE_ENV
appropriately for production deployments. - Educate team members on the importance of environment variable management and security.
To ensure no sensitive data is accidentally committed, run this final check:
Review any non-empty values returned by this script to confirm they are safe defaults, not actual sensitive data.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify no actual sensitive data in .env.example
# Test: Check for non-empty values in .env.example
rg -g '.env.example' '=.+' --no-filename
Length of output: 1273
import fastify from "fastify"; | ||
import cors from "@fastify/cors"; | ||
import { autoUnsubscribe } from "./main"; | ||
import { z } from "zod"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import ZodError
directly for accurate type checking
Currently, you're checking error instanceof z.ZodError
, but ZodError
should be imported directly from "zod"
for proper type checking and to avoid potential issues if the z
namespace changes.
Apply this diff to fix the issue:
import { z } from "zod";
+import { ZodError } from "zod";
...
- if (error instanceof z.ZodError) {
+ if (error instanceof ZodError) {
Also applies to: 33-34
return { | ||
success, | ||
message: success | ||
? "Unsubscribed successfully" | ||
: "Unsubscribed but confirmation not found", | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Return appropriate HTTP status codes based on the operation result
The API currently always returns HTTP 200 OK, even when unsubscription is not confirmed. To provide clearer feedback to clients, consider returning different HTTP status codes based on the success
value.
Apply this diff to set the status code accordingly:
const success = await autoUnsubscribe(url);
-return {
+const statusCode = success ? 200 : 400;
+return reply.status(statusCode).send({
success,
message: success
? "Unsubscribed successfully"
: "Unsubscribed but confirmation not found",
-};
+});
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
return { | |
success, | |
message: success | |
? "Unsubscribed successfully" | |
: "Unsubscribed but confirmation not found", | |
}; | |
const statusCode = success ? 200 : 400; | |
return reply.status(statusCode).send({ | |
success, | |
message: success | |
? "Unsubscribed successfully" | |
: "Unsubscribed but confirmation not found", | |
}); |
apps/unsubscriber/src/main.ts
Outdated
console.error("Error parsing AI response:", error); | ||
console.error("Raw AI response:", analysisText); | ||
throw new Error("Failed to parse AI response"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid Logging Raw AI Responses to Prevent PII Leakage
Logging the raw AI response can expose sensitive information that may be included in the AI's output.
Consider removing or sanitizing the logging of the raw AI response to prevent potential PII leakage.
console.error("Error parsing AI response:", error);
-console.error("Raw AI response:", analysisText);
+// console.error("Raw AI response:", analysisText);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
console.error("Error parsing AI response:", error); | |
console.error("Raw AI response:", analysisText); | |
throw new Error("Failed to parse AI response"); | |
console.error("Error parsing AI response:", error); | |
// console.error("Raw AI response:", analysisText); | |
throw new Error("Failed to parse AI response"); |
apps/unsubscriber/src/main.ts
Outdated
const confirmationFound = analysis.confirmationIndicator | ||
? await page.$(analysis.confirmationIndicator).then(Boolean) | ||
: finalContent.toLowerCase().includes("unsubscribed") || | ||
finalContent.toLowerCase().includes("successfully"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance Confirmation Detection Logic to Avoid False Positives
Current confirmation logic checks for generic terms like "unsubscribed" or "successfully," which may lead to false positives.
Consider improving the confirmation detection by using more specific phrases or patterns related to unsubscription. You might also check for changes in the page URL or look for specific elements that confirm the unsubscription.
const confirmationFound = analysis.confirmationIndicator
? await page.$(analysis.confirmationIndicator).then(Boolean)
- : finalContent.toLowerCase().includes("unsubscribed") ||
- finalContent.toLowerCase().includes("successfully");
+ : /unsubscribed|you have been removed|successfully unsubscribed/i.test(finalContent);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const confirmationFound = analysis.confirmationIndicator | |
? await page.$(analysis.confirmationIndicator).then(Boolean) | |
: finalContent.toLowerCase().includes("unsubscribed") || | |
finalContent.toLowerCase().includes("successfully"); | |
const confirmationFound = analysis.confirmationIndicator | |
? await page.$(analysis.confirmationIndicator).then(Boolean) | |
: /unsubscribed|you have been removed|successfully unsubscribed/i.test(finalContent); |
apps/unsubscriber/src/main.ts
Outdated
if (!isProduction) { | ||
await page.screenshot({ | ||
path: "final-state-screenshot.png", | ||
fullPage: true, | ||
}); | ||
} | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle Screenshots Carefully to Protect Sensitive Information
Taking screenshots may capture sensitive user data displayed on the page.
Ensure that screenshots are handled securely, or consider masking sensitive areas before capturing. Alternatively, limit screenshot capture to controlled debugging environments.
if (!isProduction) {
+ // Consider masking sensitive information before taking a screenshot
await page.screenshot({
path: "final-state-screenshot.png",
fullPage: true,
});
}
Implementing a mechanism to blur or hide sensitive elements can mitigate this risk.
Committable suggestion was skipped due to low confidence.
apps/unsubscriber/src/main.ts
Outdated
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | ||
const maxContentLength = 20000; // Adjust based on API limitations | ||
let contentToAnalyze = pageContent; | ||
if (contentToAnalyze.length > maxContentLength) { | ||
console.warn( | ||
`Page content exceeds ${maxContentLength} characters. Truncating...`, | ||
); | ||
contentToAnalyze = contentToAnalyze.substring(0, maxContentLength); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential PII Leakage by Sending Raw HTML Content to AI Model
The analyzePageWithAI
function sends the raw HTML content of the page to the AI model for analysis. This content may include sensitive user information or PII.
Consider sanitizing the pageContent
to remove or mask any sensitive information before including it in the AI prompt. This helps protect user privacy and complies with data protection regulations.
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> {
+ // Sanitize the page content to remove sensitive information
+ const sanitizedContent = sanitizeContent(pageContent);
- let contentToAnalyze = pageContent;
+ let contentToAnalyze = sanitizedContent;
You may need to implement a sanitizeContent
function that removes PII or sensitive data from the HTML content.
Committable suggestion was skipped due to low confidence.
apps/unsubscriber/src/main.ts
Outdated
} catch (error) { | ||
console.error("Error during unsubscribe process:", error); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ path: "error-screenshot.png", fullPage: true }); | ||
} | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prevent PII Leakage in Error Screenshots
Capturing screenshots on errors might expose sensitive information if not handled properly.
As with previous screenshot handling, ensure that error screenshots do not contain PII. Store them securely and limit access to authorized personnel.
if (!isProduction) {
+ // Ensure no sensitive information is captured in the screenshot
await page.screenshot({ path: "error-screenshot.png", fullPage: true });
}
Committable suggestion was skipped due to low confidence.
apps/unsubscriber/src/main.ts
Outdated
console.log(`Attempting action: ${action.type} on ${action.selector}`); | ||
const locator = page.locator(action.selector); | ||
|
||
if ((await locator.count()) === 0) { | ||
console.warn(`Element not found: ${action.selector}`); | ||
break; | ||
} | ||
|
||
if (!(await locator.isVisible())) { | ||
console.warn(`Element not visible: ${action.selector}`); | ||
break; | ||
} | ||
|
||
switch (action.type) { | ||
case "click": | ||
case "submit": | ||
await locator.click({ timeout: 5000 }); | ||
break; | ||
case "fill": | ||
if (action.value) { | ||
await locator.fill(action.value, { timeout: 5000 }); | ||
} | ||
break; | ||
case "select": | ||
if (action.value) { | ||
await locator.selectOption(action.value, { timeout: 5000 }); | ||
} | ||
break; | ||
} | ||
console.log(`Action completed: ${action.type} on ${action.selector}`); | ||
break; // Success, exit retry loop | ||
} catch (error) { | ||
console.warn( | ||
`Failed to perform action: ${action.type} on ${action.selector}. Retry ${retries + 1}/${MAX_RETRIES}. Error: ${error}`, | ||
); | ||
retries++; | ||
if (retries >= MAX_RETRIES) { | ||
console.error( | ||
`Max retries reached for action: ${action.type} on ${action.selector}`, | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Simplify Retries and Enhance Error Handling in performUnsubscribeActions
The retry logic within performUnsubscribeActions
could be simplified, and error messages can be made clearer.
Consider refactoring the retry mechanism for better readability and maintainability.
for (const action of actions) {
- let retries = 0;
- while (retries < MAX_RETRIES) {
+ for (let retries = 0; retries < MAX_RETRIES; retries++) {
try {
console.log(`Attempting action: ${action.type} on ${action.selector}`);
const locator = page.locator(action.selector);
if ((await locator.count()) === 0) {
console.warn(`Element not found: ${action.selector}`);
break;
}
if (!(await locator.isVisible())) {
console.warn(`Element not visible: ${action.selector}`);
break;
}
switch (action.type) {
case "click":
case "submit":
await locator.click({ timeout: 5000 });
break;
case "fill":
if (action.value) {
await locator.fill(action.value, { timeout: 5000 });
}
break;
case "select":
if (action.value) {
await locator.selectOption(action.value, { timeout: 5000 });
}
break;
}
console.log(`Action completed: ${action.type} on ${action.selector}`);
break; // Success, exit retry loop
} catch (error) {
console.warn(
`Failed to perform action: ${action.type} on ${action.selector}. Retry ${retries + 1}/${MAX_RETRIES}. Error: ${error}`,
);
- retries++;
if (retries >= MAX_RETRIES - 1) {
console.error(
`Max retries reached for action: ${action.type} on ${action.selector}`,
);
}
}
// Add delay between retries
await page.waitForTimeout(1000);
}
// Add delay between actions to mimic human behavior
await page.waitForTimeout(2000);
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
console.log(`Attempting action: ${action.type} on ${action.selector}`); | |
const locator = page.locator(action.selector); | |
if ((await locator.count()) === 0) { | |
console.warn(`Element not found: ${action.selector}`); | |
break; | |
} | |
if (!(await locator.isVisible())) { | |
console.warn(`Element not visible: ${action.selector}`); | |
break; | |
} | |
switch (action.type) { | |
case "click": | |
case "submit": | |
await locator.click({ timeout: 5000 }); | |
break; | |
case "fill": | |
if (action.value) { | |
await locator.fill(action.value, { timeout: 5000 }); | |
} | |
break; | |
case "select": | |
if (action.value) { | |
await locator.selectOption(action.value, { timeout: 5000 }); | |
} | |
break; | |
} | |
console.log(`Action completed: ${action.type} on ${action.selector}`); | |
break; // Success, exit retry loop | |
} catch (error) { | |
console.warn( | |
`Failed to perform action: ${action.type} on ${action.selector}. Retry ${retries + 1}/${MAX_RETRIES}. Error: ${error}`, | |
); | |
retries++; | |
if (retries >= MAX_RETRIES) { | |
console.error( | |
`Max retries reached for action: ${action.type} on ${action.selector}`, | |
); | |
} | |
for (const action of actions) { | |
for (let retries = 0; retries < MAX_RETRIES; retries++) { | |
try { | |
console.log(`Attempting action: ${action.type} on ${action.selector}`); | |
const locator = page.locator(action.selector); | |
if ((await locator.count()) === 0) { | |
console.warn(`Element not found: ${action.selector}`); | |
break; | |
} | |
if (!(await locator.isVisible())) { | |
console.warn(`Element not visible: ${action.selector}`); | |
break; | |
} | |
switch (action.type) { | |
case "click": | |
case "submit": | |
await locator.click({ timeout: 5000 }); | |
break; | |
case "fill": | |
if (action.value) { | |
await locator.fill(action.value, { timeout: 5000 }); | |
} | |
break; | |
case "select": | |
if (action.value) { | |
await locator.selectOption(action.value, { timeout: 5000 }); | |
} | |
break; | |
} | |
console.log(`Action completed: ${action.type} on ${action.selector}`); | |
break; // Success, exit retry loop | |
} catch (error) { | |
console.warn( | |
`Failed to perform action: ${action.type} on ${action.selector}. Retry ${retries + 1}/${MAX_RETRIES}. Error: ${error}`, | |
); | |
if (retries >= MAX_RETRIES - 1) { | |
console.error( | |
`Max retries reached for action: ${action.type} on ${action.selector}`, | |
); | |
} | |
} | |
// Add delay between retries | |
await page.waitForTimeout(1000); | |
} | |
// Add delay between actions to mimic human behavior | |
await page.waitForTimeout(2000); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (13)
apps/unsubscriber/tsconfig.json (1)
14-15
: Include and exclude patterns look good, with a minor suggestion.The current include and exclude patterns are appropriate for most TypeScript projects:
- Including all files under the
src
directory is a common and good practice.- Excluding
node_modules
is necessary to prevent TypeScript from type-checking dependencies.Consider explicitly excluding test files and other non-production code if they exist in your project. This can help optimize the compilation process. For example:
"include": ["src/**/*"], -"exclude": ["node_modules"] +"exclude": ["node_modules", "**/*.test.ts", "**/*.spec.ts"]This change would exclude test files with
.test.ts
or.spec.ts
extensions from the production build.apps/unsubscriber/package.json (4)
1-6
: Update the main entry pointThe
"main"
field is set to"index.js"
, but the start and dev scripts use"src/server.ts"
. Consider updating the main entry point to match the actual entry file.Apply this change:
- "main": "index.js", + "main": "src/server.ts",
7-12
: Implement actual testsThe test script is currently a placeholder. Consider implementing actual tests for the project to ensure code quality and prevent regressions.
Would you like assistance in setting up a testing framework and writing initial tests?
13-19
: Use consistent versioning for dependenciesFor consistency with other dependencies, consider using a caret range for
@types/node
.Apply this change:
- "@types/node": "22.5.4", + "@types/node": "^22.5.4",
20-29
: Dependencies look good, consider adding explanatory commentsThe selected dependencies align well with the project's goals. To improve maintainability, consider adding comments to explain the purpose of less common dependencies like
@ai-sdk/google
,@ai-sdk/openai
, and@t3-oss/env-core
.Example:
"dependencies": { // AI SDKs for Google and OpenAI integration "@ai-sdk/google": "^0.0.51", "@ai-sdk/openai": "^0.0.59", // ... // Type-safe environment variable handling "@t3-oss/env-core": "^0.11.1", // ... }apps/unsubscriber/src/env.ts (1)
5-11
: LGTM: Environment configuration is well-structured and secure.The environment configuration is well-defined and uses appropriate validation:
- NODE_ENV is correctly restricted to specific values.
- GOOGLE_GENERATIVE_AI_API_KEY is required and must be non-empty, which is good for security.
- CORS_ORIGIN is optional, allowing for flexibility in deployment.
Consider adding a minimum value check for PORT to ensure it's a valid port number:
- PORT: z.number().default(5000), + PORT: z.number().min(1).max(65535).default(5000),apps/unsubscriber/README.md (5)
20-44
: LGTM: Clear installation instructions with a minor suggestionThe installation section provides clear and comprehensive instructions. The inclusion of the Google AI API key link is particularly helpful.
Consider adding a note about the security implications of API keys in the
.env
file. For example:Note: Keep your `.env` file secure and never commit it to version control.🧰 Tools
🪛 Markdownlint
43-43: null
Bare URL used(MD034, no-bare-urls)
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
45-66
: LGTM: Clear usage instructions with a suggestion for improvementThe instructions for running the service and using the unsubscribe endpoint are clear and helpful. The curl example is a good addition.
Consider using a more realistic example for the unsubscribe URL in the curl command. For instance:
- "url": "https://example.com/unsubscribe" + "url": "https://newsletter.example.com/unsubscribe?user=johndoe@email.com&token=abc123"This would better illustrate a typical unsubscribe URL structure.
🧰 Tools
🪛 Markdownlint
51-51: null
Bare URL used(MD034, no-bare-urls)
48-48: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
67-70
: LGTM: API endpoints listed with suggestion for more detailThe API endpoints section correctly lists the available endpoints.
Consider expanding the endpoint descriptions to provide more context. For example:
- `GET /`: Health check endpoint. Returns a 200 OK status if the service is running. - `POST /unsubscribe`: Trigger the unsubscribe process. Accepts a JSON payload with a `url` field containing the unsubscribe link.This additional information would give users a clearer understanding of each endpoint's purpose and usage.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~69-~69: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
85-88
: LGTM: Helpful troubleshooting section with suggestion for expansionThe troubleshooting section provides useful guidance for common issues, particularly related to Playwright.
Consider expanding the troubleshooting section with more specific tips related to this service. For example:
- If you encounter "Access Denied" errors, ensure that your Google AI API key is correctly set in the `.env` file and has the necessary permissions. - If the unsubscribe process fails, try running the service with increased verbosity (e.g., `DEBUG=* pnpm start`) to get more detailed logs for debugging. - Ensure that your machine's date and time are correctly set, as this can affect API authentication.These additional tips would help users troubleshoot issues specific to this unsubscribe automation service.
24-24
: Address Markdown linting issuesThere are a few minor Markdown linting issues that should be addressed:
Wrap bare URLs in angle brackets to make them clickable:
- You can get a Google AI API Key here: https://aistudio.google.com/app/apikey + You can get a Google AI API Key here: <https://aistudio.google.com/app/apikey>- The server will start on http://localhost:5000 by default. + The server will start on <http://localhost:5000> by default.Specify languages for fenced code blocks to enable syntax highlighting:
- ``` + ```bash cd apps/unsubscribe-automation ```Apply this change to all fenced code blocks in the README.
Also applies to: 30-30, 37-37, 43-43, 48-48, 51-51, 76-76, 81-81
🧰 Tools
🪛 Markdownlint
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
apps/unsubscriber/src/main.ts (2)
80-141
: Well-implemented action performer with room for simplificationThe
performUnsubscribeActions
function is robust, with good practices like retry mechanisms and visibility checks. However, the retry logic could be simplified for better readability.Consider simplifying the retry logic:
for (const action of actions) { - let retries = 0; - while (retries < MAX_RETRIES) { + for (let retries = 0; retries < MAX_RETRIES; retries++) { try { // ... (existing action logic) break; // Success, exit retry loop } catch (error) { console.warn( `Failed to perform action: ${action.type} on ${action.selector}. Retry ${retries + 1}/${MAX_RETRIES}. Error: ${error}`, ); - retries++; - if (retries >= MAX_RETRIES) { + if (retries >= MAX_RETRIES - 1) { console.error( `Max retries reached for action: ${action.type} on ${action.selector}`, ); } } // ... (existing delay logic) } }This change simplifies the loop structure while maintaining the same functionality.
143-196
: Robust fallback unsubscribe implementation with potential for enhancementThe
performFallbackUnsubscribe
function is well-implemented with a comprehensive list of multilingual keywords and various selector strategies. The visibility check before clicking is a good practice.Consider adding a retry mechanism for individual selector attempts to handle potential transient issues:
for (const selector of allSelectors) { + const MAX_RETRIES = 3; + for (let retry = 0; retry < MAX_RETRIES; retry++) { try { // ... (existing element finding and clicking logic) + break; // Success, exit retry loop } catch (error) { - console.warn(`Error trying to click ${selector}:`, error); + console.warn(`Error trying to click ${selector} (attempt ${retry + 1}/${MAX_RETRIES}):`, error); + if (retry < MAX_RETRIES - 1) { + await page.waitForTimeout(1000); // Wait before retrying + } } + } }This enhancement could improve the robustness of the fallback strategy by handling transient issues.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (6)
- apps/unsubscriber/README.md (1 hunks)
- apps/unsubscriber/package.json (1 hunks)
- apps/unsubscriber/src/env.ts (1 hunks)
- apps/unsubscriber/src/main.ts (1 hunks)
- apps/unsubscriber/src/server.ts (1 hunks)
- apps/unsubscriber/tsconfig.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/unsubscriber/src/server.ts
🧰 Additional context used
🪛 LanguageTool
apps/unsubscriber/README.md
[uncategorized] ~69-~69: Loose punctuation mark.
Context: ...scribe URL. ## API Endpoints -GET /
: Health check endpoint - `POST /unsubscr...(UNLIKELY_OPENING_PUNCTUATION)
🪛 Markdownlint
apps/unsubscriber/README.md
43-43: null
Bare URL used(MD034, no-bare-urls)
51-51: null
Bare URL used(MD034, no-bare-urls)
24-24: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
37-37: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
48-48: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
76-76: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
81-81: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
🔇 Additional comments (6)
apps/unsubscriber/src/env.ts (2)
1-3
: LGTM: Imports are appropriate and complete.The imports cover all necessary libraries for environment variable management and validation. The inclusion of "dotenv/config" ensures that environment variables from a .env file are properly loaded.
13-33
: Excellent: Well-configured options with informative comments.The additional configuration options are well-chosen:
- Setting
runtimeEnv
toprocess.env
is appropriate for a Node.js application.- Enabling
emptyStringAsUndefined
is a good practice to handle empty environment variables correctly.The detailed comment explaining the purpose and benefits of
emptyStringAsUndefined
is particularly valuable. It provides clear reasoning for future developers and maintainers.apps/unsubscriber/README.md (2)
1-12
: LGTM: Clear and informative introductionThe introduction and features section provide a comprehensive overview of the Unsubscribe Automation Service, aligning well with the PR objectives. The content is clear and informative.
72-83
: LGTM: Clear development instructionsThe development section provides clear instructions for running the service in development mode and building TypeScript files. The use of pnpm is consistent with the project setup.
🧰 Tools
🪛 Markdownlint
76-76: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
81-81: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
apps/unsubscriber/src/main.ts (2)
8-19
: LGTM: Well-structured Zod schema for page analysisThe
pageAnalysisSchema
is well-defined using Zod, providing a clear structure for the expected AI analysis output. This will help ensure type safety and validation of the AI responses.
1-277
: Overall Assessment: Robust Implementation with Minor Improvements NeededThe
main.ts
file presents a well-structured and comprehensive solution for automating email unsubscriptions. It effectively combines AI analysis with fallback strategies and includes robust error handling. The code is generally well-organized and follows good practices.Key strengths:
- Use of AI for analyzing unsubscribe pages
- Fallback strategy with multilingual support
- Retry mechanisms for handling transient issues
- Proper error handling and logging
Areas for improvement:
- Potential PII leakage in AI response logging
- Simplification of retry logic in
performUnsubscribeActions
- Enhancement of confirmation detection logic
- Secure handling of screenshots to prevent sensitive data exposure
Overall, the implementation is solid, and addressing the identified issues will further improve its security and efficiency. Great job on creating a complex automation solution!
"compilerOptions": { | ||
"target": "es2018", | ||
"module": "es2015", | ||
"moduleResolution": "bundler", | ||
"lib": ["es2018", "dom"], | ||
"strict": true, | ||
"esModuleInterop": true, | ||
"skipLibCheck": true, | ||
"forceConsistentCasingInFileNames": true, | ||
"outDir": "./dist", | ||
"rootDir": "./src" | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider updating TypeScript compiler options for better compatibility and features.
The current configuration is good, but there are a few suggestions for improvement:
- Update the
target
andmodule
to more recent versions (e.g., ES2020 or ES2022) for better performance and features, if your deployment environment supports it. - The
"moduleResolution": "bundler"
is a newer option and might not be supported by all tools. Consider using"node"
if you encounter any compatibility issues. - Be cautious with
"skipLibCheck": true
as it can hide issues in declaration files. Consider setting it tofalse
if you want stricter type checking.
Here's a suggested update to the compiler options:
"compilerOptions": {
- "target": "es2018",
- "module": "es2015",
+ "target": "es2020",
+ "module": "es2020",
"moduleResolution": "bundler",
- "lib": ["es2018", "dom"],
+ "lib": ["es2020", "dom"],
"strict": true,
"esModuleInterop": true,
- "skipLibCheck": true,
+ "skipLibCheck": false,
"forceConsistentCasingInFileNames": true,
"outDir": "./dist",
"rootDir": "./src"
},
These changes will enable more recent JavaScript features and stricter type checking.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"compilerOptions": { | |
"target": "es2018", | |
"module": "es2015", | |
"moduleResolution": "bundler", | |
"lib": ["es2018", "dom"], | |
"strict": true, | |
"esModuleInterop": true, | |
"skipLibCheck": true, | |
"forceConsistentCasingInFileNames": true, | |
"outDir": "./dist", | |
"rootDir": "./src" | |
}, | |
"compilerOptions": { | |
"target": "es2020", | |
"module": "es2020", | |
"moduleResolution": "bundler", | |
"lib": ["es2020", "dom"], | |
"strict": true, | |
"esModuleInterop": true, | |
"skipLibCheck": false, | |
"forceConsistentCasingInFileNames": true, | |
"outDir": "./dist", | |
"rootDir": "./src" | |
}, |
apps/unsubscriber/README.md
Outdated
## Prerequisites | ||
|
||
- Node.js (v14 or later) | ||
- pnpm package manager | ||
- A Google AI API key | ||
- An OpenAI API key | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify OpenAI API key requirement
The prerequisites section mentions an OpenAI API key, but it's not included in the .env
setup instructions later in the README. This might confuse users.
Consider one of the following actions:
- If the OpenAI API key is required, add it to the
.env
setup instructions. - If it's not required, remove it from the prerequisites list.
- If it's optional, clarify this in the prerequisites section.
apps/unsubscriber/src/main.ts
Outdated
import { generateText } from "ai"; | ||
import { google } from "@ai-sdk/google"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Add missing dependencies to package.json
The packages ai
and @ai-sdk/google
are imported in apps/unsubscriber/src/main.ts
but are not listed in your package.json
dependencies. Please add these packages to ensure the imports work correctly.
🔗 Analysis chain
Verify AI library imports and configuration
The imports for generateText
and google
from "ai" and "@ai-sdk/google" respectively may need verification. Ensure these modules are correctly installed and configured in your project environment.
Run the following script to check the package.json for these dependencies:
If the result is false
, you may need to install these packages or update your import statements.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
jq '.dependencies | has("ai") and has("@ai-sdk/google")' package.json
Length of output: 77
apps/unsubscriber/src/main.ts
Outdated
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | ||
const maxContentLength = 20000; // Adjust based on API limitations | ||
let contentToAnalyze = pageContent; | ||
if (contentToAnalyze.length > maxContentLength) { | ||
console.warn( | ||
`Page content exceeds ${maxContentLength} characters. Truncating...`, | ||
); | ||
contentToAnalyze = contentToAnalyze.substring(0, maxContentLength); | ||
} | ||
|
||
const prompt = ` | ||
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | ||
Provide a JSON object with: | ||
1. An 'actions' array containing steps to unsubscribe. Each action should have: | ||
- 'type': Either 'click', 'fill', 'select', or 'submit' | ||
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | ||
- 'value': (Optional) For input fields. Omit this field if not applicable. | ||
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | ||
|
||
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | ||
|
||
HTML Content: | ||
${contentToAnalyze} | ||
`; | ||
|
||
let analysisText: string; | ||
try { | ||
const analysisPromise = generateText({ | ||
model: google("gemini-1.5-flash"), | ||
prompt: prompt, | ||
}); | ||
|
||
// Timeout if AI takes too long to respond | ||
const timeoutPromise = new Promise((_, reject) => | ||
setTimeout(() => reject(new Error("AI analysis timeout")), 30000), | ||
); | ||
|
||
const { text } = (await Promise.race([ | ||
analysisPromise, | ||
timeoutPromise, | ||
])) as { text: string }; | ||
analysisText = text; | ||
} catch (error) { | ||
console.error("Error generating AI analysis:", error); | ||
throw new Error("Failed to generate AI analysis"); | ||
} | ||
|
||
try { | ||
// Remove any markdown code block indicators | ||
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | ||
const parsedAnalysis = JSON.parse(cleanedText); | ||
return pageAnalysisSchema.parse(parsedAnalysis); | ||
} catch (error) { | ||
console.error("Error parsing AI response:", error); | ||
console.error("Raw AI response:", analysisText); | ||
throw new Error("Failed to parse AI response"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well-implemented AI analysis function with room for improvement
The analyzePageWithAI
function is well-structured and includes good practices such as content truncation and timeout handling. However, there's a potential issue with logging the raw AI response.
To address the potential PII leakage, consider removing or sanitizing the logging of the raw AI response:
console.error("Error parsing AI response:", error);
-console.error("Raw AI response:", analysisText);
+// Log a sanitized version or remove this line
This change will help prevent potential exposure of sensitive information that may be included in the AI's output.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | |
const maxContentLength = 20000; // Adjust based on API limitations | |
let contentToAnalyze = pageContent; | |
if (contentToAnalyze.length > maxContentLength) { | |
console.warn( | |
`Page content exceeds ${maxContentLength} characters. Truncating...`, | |
); | |
contentToAnalyze = contentToAnalyze.substring(0, maxContentLength); | |
} | |
const prompt = ` | |
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | |
Provide a JSON object with: | |
1. An 'actions' array containing steps to unsubscribe. Each action should have: | |
- 'type': Either 'click', 'fill', 'select', or 'submit' | |
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | |
- 'value': (Optional) For input fields. Omit this field if not applicable. | |
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | |
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | |
HTML Content: | |
${contentToAnalyze} | |
`; | |
let analysisText: string; | |
try { | |
const analysisPromise = generateText({ | |
model: google("gemini-1.5-flash"), | |
prompt: prompt, | |
}); | |
// Timeout if AI takes too long to respond | |
const timeoutPromise = new Promise((_, reject) => | |
setTimeout(() => reject(new Error("AI analysis timeout")), 30000), | |
); | |
const { text } = (await Promise.race([ | |
analysisPromise, | |
timeoutPromise, | |
])) as { text: string }; | |
analysisText = text; | |
} catch (error) { | |
console.error("Error generating AI analysis:", error); | |
throw new Error("Failed to generate AI analysis"); | |
} | |
try { | |
// Remove any markdown code block indicators | |
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | |
const parsedAnalysis = JSON.parse(cleanedText); | |
return pageAnalysisSchema.parse(parsedAnalysis); | |
} catch (error) { | |
console.error("Error parsing AI response:", error); | |
console.error("Raw AI response:", analysisText); | |
throw new Error("Failed to parse AI response"); | |
} | |
} | |
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | |
const maxContentLength = 20000; // Adjust based on API limitations | |
let contentToAnalyze = pageContent; | |
if (contentToAnalyze.length > maxContentLength) { | |
console.warn( | |
`Page content exceeds ${maxContentLength} characters. Truncating...`, | |
); | |
contentToAnalyze = contentToAnalyze.substring(0, maxContentLength); | |
} | |
const prompt = ` | |
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | |
Provide a JSON object with: | |
1. An 'actions' array containing steps to unsubscribe. Each action should have: | |
- 'type': Either 'click', 'fill', 'select', or 'submit' | |
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | |
- 'value': (Optional) For input fields. Omit this field if not applicable. | |
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | |
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | |
HTML Content: | |
${contentToAnalyze} | |
`; | |
let analysisText: string; | |
try { | |
const analysisPromise = generateText({ | |
model: google("gemini-1.5-flash"), | |
prompt: prompt, | |
}); | |
// Timeout if AI takes too long to respond | |
const timeoutPromise = new Promise((_, reject) => | |
setTimeout(() => reject(new Error("AI analysis timeout")), 30000), | |
); | |
const { text } = (await Promise.race([ | |
analysisPromise, | |
timeoutPromise, | |
])) as { text: string }; | |
analysisText = text; | |
} catch (error) { | |
console.error("Error generating AI analysis:", error); | |
throw new Error("Failed to generate AI analysis"); | |
} | |
try { | |
// Remove any markdown code block indicators | |
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | |
const parsedAnalysis = JSON.parse(cleanedText); | |
return pageAnalysisSchema.parse(parsedAnalysis); | |
} catch (error) { | |
console.error("Error parsing AI response:", error); | |
// Log a sanitized version or remove this line | |
throw new Error("Failed to parse AI response"); | |
} | |
} |
apps/unsubscriber/src/main.ts
Outdated
export async function autoUnsubscribe(url: string): Promise<boolean> { | ||
// Validate URL | ||
try { | ||
new URL(url); | ||
} catch (err) { | ||
console.error("Invalid URL provided:", url); | ||
return false; | ||
} | ||
|
||
const isProduction = env.NODE_ENV === "production"; | ||
const browser = await chromium.launch({ headless: isProduction }); | ||
const page = await browser.newPage(); | ||
|
||
try { | ||
console.log(`Navigating to ${url}`); | ||
await page.goto(url, { timeout: 30000 }); | ||
const initialContent = await page.content(); | ||
|
||
const maxContentLength = 20000; // Adjust based on API limitations | ||
const truncatedContent = | ||
initialContent.length > maxContentLength | ||
? initialContent.substring(0, maxContentLength) | ||
: initialContent; | ||
|
||
const analysis = await analyzePageWithAI(truncatedContent); | ||
console.log("AI analysis result:", JSON.stringify(analysis, null, 2)); | ||
|
||
if (analysis.actions.length > 0) { | ||
await performUnsubscribeActions(page, analysis.actions); | ||
} else { | ||
console.log("No actions determined by AI. Attempting fallback strategy."); | ||
const fallbackSuccess = await performFallbackUnsubscribe(page); | ||
if (!fallbackSuccess) { | ||
console.log("Fallback strategy failed to find unsubscribe element."); | ||
return false; | ||
} | ||
} | ||
|
||
// Wait for any redirects or page loads to complete | ||
await page | ||
.waitForLoadState("networkidle", { timeout: 10000 }) | ||
.catch((error) => { | ||
console.warn( | ||
"Error waiting for network idle state after actions:", | ||
error, | ||
); | ||
}); | ||
|
||
// Check for confirmation | ||
const finalContent = await page.content(); | ||
const confirmationFound = analysis.confirmationIndicator | ||
? await page.$(analysis.confirmationIndicator).then(Boolean) | ||
: finalContent.toLowerCase().includes("unsubscribed") || | ||
finalContent.toLowerCase().includes("successfully"); | ||
|
||
if (confirmationFound) { | ||
console.log("Unsubscribe confirmation found."); | ||
return true; | ||
} | ||
|
||
console.log("Unsubscribe action performed, but confirmation not found."); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ | ||
path: "final-state-screenshot.png", | ||
fullPage: true, | ||
}); | ||
} | ||
return false; | ||
} catch (error) { | ||
console.error("Error during unsubscribe process:", error); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ path: "error-screenshot.png", fullPage: true }); | ||
} | ||
return false; | ||
} finally { | ||
await browser.close(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well-structured main function with areas for improvement
The autoUnsubscribe
function is well-implemented, covering URL validation, AI-driven unsubscription, fallback strategies, and error handling. However, there are two areas that could be improved:
- Confirmation detection logic
- Secure handling of screenshots
Enhance the confirmation detection logic to avoid potential false positives:
const confirmationFound = analysis.confirmationIndicator
? await page.$(analysis.confirmationIndicator).then(Boolean)
- : finalContent.toLowerCase().includes("unsubscribed") ||
- finalContent.toLowerCase().includes("successfully");
+ : /unsubscribed|you have been removed|successfully unsubscribed/i.test(finalContent);
This change uses a more specific regex pattern to detect unsubscription confirmation, reducing the chance of false positives.
Handle screenshots more securely to protect potentially sensitive information:
if (!isProduction) {
+ // Implement a function to mask sensitive areas before capturing screenshots
+ await maskSensitiveAreas(page);
await page.screenshot({
path: "final-state-screenshot.png",
fullPage: true,
});
}
Implement a maskSensitiveAreas
function that hides or blurs areas of the page that might contain sensitive information before taking screenshots. This helps prevent accidental capture of personal data in error reports.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export async function autoUnsubscribe(url: string): Promise<boolean> { | |
// Validate URL | |
try { | |
new URL(url); | |
} catch (err) { | |
console.error("Invalid URL provided:", url); | |
return false; | |
} | |
const isProduction = env.NODE_ENV === "production"; | |
const browser = await chromium.launch({ headless: isProduction }); | |
const page = await browser.newPage(); | |
try { | |
console.log(`Navigating to ${url}`); | |
await page.goto(url, { timeout: 30000 }); | |
const initialContent = await page.content(); | |
const maxContentLength = 20000; // Adjust based on API limitations | |
const truncatedContent = | |
initialContent.length > maxContentLength | |
? initialContent.substring(0, maxContentLength) | |
: initialContent; | |
const analysis = await analyzePageWithAI(truncatedContent); | |
console.log("AI analysis result:", JSON.stringify(analysis, null, 2)); | |
if (analysis.actions.length > 0) { | |
await performUnsubscribeActions(page, analysis.actions); | |
} else { | |
console.log("No actions determined by AI. Attempting fallback strategy."); | |
const fallbackSuccess = await performFallbackUnsubscribe(page); | |
if (!fallbackSuccess) { | |
console.log("Fallback strategy failed to find unsubscribe element."); | |
return false; | |
} | |
} | |
// Wait for any redirects or page loads to complete | |
await page | |
.waitForLoadState("networkidle", { timeout: 10000 }) | |
.catch((error) => { | |
console.warn( | |
"Error waiting for network idle state after actions:", | |
error, | |
); | |
}); | |
// Check for confirmation | |
const finalContent = await page.content(); | |
const confirmationFound = analysis.confirmationIndicator | |
? await page.$(analysis.confirmationIndicator).then(Boolean) | |
: finalContent.toLowerCase().includes("unsubscribed") || | |
finalContent.toLowerCase().includes("successfully"); | |
if (confirmationFound) { | |
console.log("Unsubscribe confirmation found."); | |
return true; | |
} | |
console.log("Unsubscribe action performed, but confirmation not found."); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ | |
path: "final-state-screenshot.png", | |
fullPage: true, | |
}); | |
} | |
return false; | |
} catch (error) { | |
console.error("Error during unsubscribe process:", error); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ path: "error-screenshot.png", fullPage: true }); | |
} | |
return false; | |
} finally { | |
await browser.close(); | |
} | |
} | |
export async function autoUnsubscribe(url: string): Promise<boolean> { | |
// Validate URL | |
try { | |
new URL(url); | |
} catch (err) { | |
console.error("Invalid URL provided:", url); | |
return false; | |
} | |
const isProduction = env.NODE_ENV === "production"; | |
const browser = await chromium.launch({ headless: isProduction }); | |
const page = await browser.newPage(); | |
try { | |
console.log(`Navigating to ${url}`); | |
await page.goto(url, { timeout: 30000 }); | |
const initialContent = await page.content(); | |
const maxContentLength = 20000; // Adjust based on API limitations | |
const truncatedContent = | |
initialContent.length > maxContentLength | |
? initialContent.substring(0, maxContentLength) | |
: initialContent; | |
const analysis = await analyzePageWithAI(truncatedContent); | |
console.log("AI analysis result:", JSON.stringify(analysis, null, 2)); | |
if (analysis.actions.length > 0) { | |
await performUnsubscribeActions(page, analysis.actions); | |
} else { | |
console.log("No actions determined by AI. Attempting fallback strategy."); | |
const fallbackSuccess = await performFallbackUnsubscribe(page); | |
if (!fallbackSuccess) { | |
console.log("Fallback strategy failed to find unsubscribe element."); | |
return false; | |
} | |
} | |
// Wait for any redirects or page loads to complete | |
await page | |
.waitForLoadState("networkidle", { timeout: 10000 }) | |
.catch((error) => { | |
console.warn( | |
"Error waiting for network idle state after actions:", | |
error, | |
); | |
}); | |
// Check for confirmation | |
const finalContent = await page.content(); | |
const confirmationFound = analysis.confirmationIndicator | |
? await page.$(analysis.confirmationIndicator).then(Boolean) | |
: /unsubscribed|you have been removed|successfully unsubscribed/i.test(finalContent); | |
if (confirmationFound) { | |
console.log("Unsubscribe confirmation found."); | |
return true; | |
} | |
console.log("Unsubscribe action performed, but confirmation not found."); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ | |
path: "final-state-screenshot.png", | |
fullPage: true, | |
}); | |
} | |
return false; | |
} catch (error) { | |
console.error("Error during unsubscribe process:", error); | |
// Only take screenshot if not in production | |
if (!isProduction) { | |
await page.screenshot({ path: "error-screenshot.png", fullPage: true }); | |
} | |
return false; | |
} finally { | |
await browser.close(); | |
} | |
} |
@madhav-relish I'm still going through this but looking good so far. Nice touch with the screenshot for errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
apps/unsubscriber/README.md (3)
42-42
: Improve URL formatting for better accessibilityConsider wrapping the Google AI API key URL in proper markdown link syntax for better formatting and accessibility.
Change the line to:
You can get a Google AI API Key [here](https://aistudio.google.com/app/apikey).🧰 Tools
🪛 Markdownlint
42-42: null
Bare URL used(MD034, no-bare-urls)
52-73
: LGTM: Clear instructions with minor formatting suggestionThe instructions for running the service and using the unsubscribe endpoint are clear and well-explained. The example curl command is helpful for users.
For consistency, consider formatting the localhost URL on line 58:
The server will start on `http://localhost:5000` by default.This change improves readability and consistency with code formatting throughout the document.
🧰 Tools
🪛 Markdownlint
58-58: null
Bare URL used(MD034, no-bare-urls)
55-55: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
79-90
: Improve code block formatting in the development sectionThe development instructions are clear and helpful. To enhance readability and enable syntax highlighting, consider specifying the language for the fenced code blocks:
- ``` + ```bash pnpm run dev
pnpm run buildThese changes will improve the visual presentation of the commands in rendered markdown. <details> <summary>🧰 Tools</summary> <details> <summary>🪛 Markdownlint</summary><blockquote> 83-83: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 88-88: null Fenced code blocks should have a language specified (MD040, fenced-code-language) </blockquote></details> </details> </blockquote></details> </blockquote></details> <details> <summary>📜 Review details</summary> **Configuration used: CodeRabbit UI** **Review profile: CHILL** <details> <summary>📥 Commits</summary> Files that changed from the base of the PR and between 48b650f760fa584eef462a5a33e2e0670aecd654 and c931419200d53c83afd7a3eb1bd0a41a7c37d2b9. </details> <details> <summary>📒 Files selected for processing (2)</summary> * apps/unsubscriber/README.md (1 hunks) * apps/unsubscriber/src/server.ts (1 hunks) </details> <details> <summary>🚧 Files skipped from review as they are similar to previous changes (1)</summary> * apps/unsubscriber/src/server.ts </details> <details> <summary>🧰 Additional context used</summary> <details> <summary>🪛 LanguageTool</summary><blockquote> <details> <summary>apps/unsubscriber/README.md</summary><blockquote> [uncategorized] ~76-~76: Loose punctuation mark. Context: ...scribe URL. ## API Endpoints - `GET /`: Health check endpoint - `POST /unsubscr... (UNLIKELY_OPENING_PUNCTUATION) </blockquote></details> </blockquote></details> <details> <summary>🪛 Markdownlint</summary><blockquote> <details> <summary>apps/unsubscriber/README.md</summary><blockquote> 42-42: null Bare URL used (MD034, no-bare-urls) --- 58-58: null Bare URL used (MD034, no-bare-urls) --- 23-23: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 29-29: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 36-36: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 55-55: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 83-83: null Fenced code blocks should have a language specified (MD040, fenced-code-language) --- 88-88: null Fenced code blocks should have a language specified (MD040, fenced-code-language) </blockquote></details> </blockquote></details> </details> <details> <summary>🔇 Additional comments (4)</summary><blockquote> <details> <summary>apps/unsubscriber/README.md (4)</summary><blockquote> `1-12`: **LGTM: Clear introduction and comprehensive features list** The introduction provides a concise overview of the Unsubscribe Automation Service, and the features list effectively highlights the key aspects of the service. This section is well-written and informative. --- `13-18`: **LGTM: Clear prerequisites, OpenAI API key removed** The prerequisites section clearly lists the necessary components for running the service. The OpenAI API key mentioned in the past review comment has been removed, which aligns with the current implementation using Google AI. --- `74-78`: **LGTM: Clear API endpoints listing** The API endpoints section provides a concise and clear list of available endpoints. The formatting is correct and easy to read. <details> <summary>🧰 Tools</summary> <details> <summary>🪛 LanguageTool</summary><blockquote> [uncategorized] ~76-~76: Loose punctuation mark. Context: ...scribe URL. ## API Endpoints - `GET /`: Health check endpoint - `POST /unsubscr... (UNLIKELY_OPENING_PUNCTUATION) </blockquote></details> </details> --- `92-95`: **LGTM: Helpful troubleshooting guidance** The troubleshooting section provides valuable information for users who might encounter issues. The reference to the Playwright installation guide and the suggestion to check console output for error messages are particularly helpful. </blockquote></details> </blockquote></details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (4)
apps/unsubscriber/src/llm.ts (1)
8-19
: Overall implementation looks good, with some suggestions for improvement.The
getModel
function is well-structured and correctly implements the multi-provider support. However, consider the following improvements:
- Add a default case to the switch statement to handle potential invalid providers:
default: throw new Error(`Unsupported provider: ${provider}`);
- Consider using environment variables for model identifiers to make updates easier:
case "google": return google(process.env.GOOGLE_MODEL_ID || "gemini-1.5-flash");
- Add error handling for provider-specific function calls:
try { return google(modelId); } catch (error) { console.error(`Error initializing Google model: ${error.message}`); throw error; }These changes will improve the robustness and maintainability of the function.
apps/unsubscriber/src/main.ts (3)
1-16
: LGTM! Consider grouping related imports and constants.The imports and constants are well-defined and appropriate for the functionality. To improve readability, consider grouping related imports and constants together.
You could group the imports like this:
import { chromium } from "playwright"; import type { Page, ElementHandle, Locator } from "playwright"; import { z } from "zod"; + import { generateText } from "ai"; -import { env } from "./env"; import { getModel } from "./llm"; +import { env } from "./env"; const MAX_CONTENT_LENGTH = 20_000; // Adjust based on API limitations const AI_TIMEOUT = 30_000; +// Playwright-related constants const NETWORK_IDLE_TIMEOUT = 10_000; const MAX_RETRIES = 3; // Limit retries to 3 to avoid infinite loops const RETRY_DELAY = 1_000; // Delay between retries in milliseconds const ACTION_TIMEOUT = 5_000; // Timeout for each action in milliseconds const ACTION_DELAY = 2_000; // Delay between actions in milliseconds
76-149
: Well-structured action execution with room for optimizationThe
performUnsubscribeActions
andperformAction
functions are well-implemented with good error handling and logging. The separation of concerns between these functions improves readability.Consider simplifying the retry logic in
performUnsubscribeActions
for better readability:- for (const action of actions) { - let retries = 0; - while (retries < MAX_RETRIES) { + for (const action of actions) { + for (let retries = 0; retries < MAX_RETRIES; retries++) { try { // ... (existing code) break; // Success, exit retry loop } catch (error) { // ... (existing code) - retries++; - if (retries >= MAX_RETRIES) { + if (retries === MAX_RETRIES - 1) { console.error( `Max retries reached for action: ${action.type} on ${action.selector}`, ); } else { await page.waitForTimeout(RETRY_DELAY); } } } // ... (existing code) }This change simplifies the loop structure and makes the retry logic more straightforward.
151-204
: Robust fallback strategy with potential for optimizationThe
performFallbackUnsubscribe
function is well-implemented with a comprehensive multilingual approach. The error handling and visibility checking are appropriately handled.Consider optimizing the selector generation logic for better maintainability:
- const generateSelectors = (keyword: string) => [ - `button:has-text("${keyword}")`, - `a:has-text("${keyword}")`, - `input[type="submit"][value*="${keyword}" i]`, - `:text("${keyword}")`, - `[aria-label*="${keyword}" i]`, - `[title*="${keyword}" i]`, - ]; + const selectorTemplates = [ + (keyword: string) => `button:has-text("${keyword}")`, + (keyword: string) => `a:has-text("${keyword}")`, + (keyword: string) => `input[type="submit"][value*="${keyword}" i]`, + (keyword: string) => `:text("${keyword}")`, + (keyword: string) => `[aria-label*="${keyword}" i]`, + (keyword: string) => `[title*="${keyword}" i]`, + ]; + + const generateSelectors = (keyword: string) => selectorTemplates.map(template => template(keyword));This change separates the selector templates from the generation logic, making it easier to maintain and extend in the future.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (3)
- apps/unsubscriber/package.json (1 hunks)
- apps/unsubscriber/src/llm.ts (1 hunks)
- apps/unsubscriber/src/main.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/unsubscriber/package.json
🧰 Additional context used
🔇 Additional comments (4)
apps/unsubscriber/src/llm.ts (2)
1-4
: LGTM: Import statements are correct and well-structured.The import statements correctly import the necessary functions from different AI SDKs, providing a good foundation for the multi-provider support in this file.
6-6
: LGTM: Type alias is well-defined and exported.The
LLMProvider
type alias is correctly defined as a union of string literals, representing the supported AI providers. This type definition will help ensure type safety when using thegetModel
function.apps/unsubscriber/src/main.ts (2)
17-28
: Well-structured schema and type definition.The Zod schema for page analysis results is well-defined, covering all necessary fields for unsubscribe actions. Deriving the TypeScript type from the Zod schema ensures consistency between runtime validation and static typing.
206-259
:⚠️ Potential issueWell-structured main function with areas for improvement
The
autoUnsubscribe
function is well-implemented, covering URL validation, AI-driven unsubscription, fallback strategies, and error handling. However, there's an area that could be improved:Enhance the confirmation detection logic to avoid potential false positives:
const confirmationFound = analysis.confirmationIndicator ? await page.$(analysis.confirmationIndicator).then(Boolean) - : finalContent.toLowerCase().includes("unsubscribed") || - finalContent.toLowerCase().includes("successfully"); + : /unsubscribed|you have been removed|successfully unsubscribed/i.test(finalContent);This change uses a more specific regex pattern to detect unsubscription confirmation, reducing the chance of false positives.
Likely invalid or redundant comment.
apps/unsubscriber/src/main.ts
Outdated
if (confirmationFound) { | ||
console.log("Unsubscribe confirmation found."); | ||
return true; | ||
} | ||
|
||
console.log("Unsubscribe action performed, but confirmation not found."); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ | ||
path: "final-state-screenshot.png", | ||
fullPage: true, | ||
}); | ||
} | ||
return false; | ||
} catch (error) { | ||
console.error("Error during unsubscribe process:", error); | ||
// Only take screenshot if not in production | ||
if (!isProduction) { | ||
await page.screenshot({ path: "error-screenshot.png", fullPage: true }); | ||
} | ||
return false; | ||
} finally { | ||
await browser.close(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve security measures for error screenshots
The error handling and logging in the autoUnsubscribe
function are comprehensive. However, there's a potential security risk with capturing screenshots that may contain sensitive information.
Consider implementing a function to mask sensitive areas before capturing screenshots:
if (!isProduction) {
+ await maskSensitiveAreas(page);
await page.screenshot({
path: "final-state-screenshot.png",
fullPage: true,
});
}
Implement a maskSensitiveAreas
function that hides or blurs areas of the page that might contain sensitive information before taking screenshots. This helps prevent accidental capture of personal data in error reports.
Would you like assistance in implementing the maskSensitiveAreas
function?
Committable suggestion was skipped due to low confidence.
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | ||
const contentToAnalyze = pageContent.slice(0, MAX_CONTENT_LENGTH); | ||
if (contentToAnalyze.length < pageContent.length) { | ||
console.warn( | ||
`Page content exceeds ${MAX_CONTENT_LENGTH} characters. Truncated.`, | ||
); | ||
} | ||
|
||
const model = getModel("google"); | ||
|
||
const prompt = ` | ||
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | ||
Provide a JSON object with: | ||
1. An 'actions' array containing steps to unsubscribe. Each action should have: | ||
- 'type': Either 'click', 'fill', 'select', or 'submit' | ||
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | ||
- 'value': (Optional) For input fields. Omit this field if not applicable. | ||
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | ||
|
||
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | ||
|
||
HTML Content: | ||
${contentToAnalyze} | ||
`; | ||
|
||
try { | ||
const { text: analysisText } = await Promise.race([ | ||
generateText({ model, prompt }), | ||
new Promise<never>((_, reject) => | ||
setTimeout(() => reject(new Error("AI analysis timeout")), AI_TIMEOUT), | ||
), | ||
]); | ||
|
||
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | ||
const parsedAnalysis = JSON.parse(cleanedText); | ||
return pageAnalysisSchema.parse(parsedAnalysis); | ||
} catch (error) { | ||
console.error("Error in AI analysis:", error); | ||
console.error( | ||
"Raw AI response:", | ||
error instanceof Error ? error.message : String(error), | ||
); | ||
throw new Error("Failed to generate or parse AI analysis"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well-implemented AI analysis function with room for improvement
The analyzePageWithAI
function is well-structured and includes good practices such as content truncation and timeout handling. However, there's a potential issue with logging the raw AI response.
To address the potential PII leakage, consider removing or sanitizing the logging of the raw AI response:
console.error("Error in AI analysis:", error);
-console.error(
- "Raw AI response:",
- error instanceof Error ? error.message : String(error),
-);
+console.error("AI analysis error details:", error instanceof Error ? error.message : String(error));
This change will help prevent potential exposure of sensitive information that may be included in the AI's output.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | |
const contentToAnalyze = pageContent.slice(0, MAX_CONTENT_LENGTH); | |
if (contentToAnalyze.length < pageContent.length) { | |
console.warn( | |
`Page content exceeds ${MAX_CONTENT_LENGTH} characters. Truncated.`, | |
); | |
} | |
const model = getModel("google"); | |
const prompt = ` | |
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | |
Provide a JSON object with: | |
1. An 'actions' array containing steps to unsubscribe. Each action should have: | |
- 'type': Either 'click', 'fill', 'select', or 'submit' | |
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | |
- 'value': (Optional) For input fields. Omit this field if not applicable. | |
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | |
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | |
HTML Content: | |
${contentToAnalyze} | |
`; | |
try { | |
const { text: analysisText } = await Promise.race([ | |
generateText({ model, prompt }), | |
new Promise<never>((_, reject) => | |
setTimeout(() => reject(new Error("AI analysis timeout")), AI_TIMEOUT), | |
), | |
]); | |
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | |
const parsedAnalysis = JSON.parse(cleanedText); | |
return pageAnalysisSchema.parse(parsedAnalysis); | |
} catch (error) { | |
console.error("Error in AI analysis:", error); | |
console.error( | |
"Raw AI response:", | |
error instanceof Error ? error.message : String(error), | |
); | |
throw new Error("Failed to generate or parse AI analysis"); | |
} | |
} | |
async function analyzePageWithAI(pageContent: string): Promise<PageAnalysis> { | |
const contentToAnalyze = pageContent.slice(0, MAX_CONTENT_LENGTH); | |
if (contentToAnalyze.length < pageContent.length) { | |
console.warn( | |
`Page content exceeds ${MAX_CONTENT_LENGTH} characters. Truncated.`, | |
); | |
} | |
const model = getModel("google"); | |
const prompt = ` | |
Analyze the following HTML content and determine the actions needed to unsubscribe from an email newsletter. | |
Provide a JSON object with: | |
1. An 'actions' array containing steps to unsubscribe. Each action should have: | |
- 'type': Either 'click', 'fill', 'select', or 'submit' | |
- 'selector': A CSS selector for the element. Use only standard CSS selectors (e.g., '#id', '.class', 'tag', '[attribute]'). | |
- 'value': (Optional) For input fields. Omit this field if not applicable. | |
2. A 'confirmationIndicator' string to verify success. This should be a CSS selector for an element that appears after successful unsubscription. If uncertain, set to null. | |
Return ONLY the JSON object, without any markdown formatting, code blocks, or explanation. | |
HTML Content: | |
${contentToAnalyze} | |
`; | |
try { | |
const { text: analysisText } = await Promise.race([ | |
generateText({ model, prompt }), | |
new Promise<never>((_, reject) => | |
setTimeout(() => reject(new Error("AI analysis timeout")), AI_TIMEOUT), | |
), | |
]); | |
const cleanedText = analysisText.replace(/```json\n?|\n?```/g, "").trim(); | |
const parsedAnalysis = JSON.parse(cleanedText); | |
return pageAnalysisSchema.parse(parsedAnalysis); | |
} catch (error) { | |
console.error("Error in AI analysis:", error); | |
console.error("AI analysis error details:", error instanceof Error ? error.message : String(error)); | |
throw new Error("Failed to generate or parse AI analysis"); | |
} | |
} |
@madhav-relish thanks for the PR! Most of the changes I made were just Cursor going over each section of code. One potentially big improvement would be to try click the button without using AI at all, with something like this: await page.getByText('Unsubscribe').click() And then for the different variations of this for different text that's common in unsub flows / different languages. The next step for this would be to integrate it with the core app. |
Description
Develop a Playwright-based application that automatically navigates and completes unsubscribe processes for email subscriptions, using an LLM to interpret pages and guide actions.
Changes
unsubscribe-automation
serviceautoUnsubscribe
function using Playwright for web automationNew Files
src/main.ts
: Core logic for unsubscribe automationsrc/server.ts
: Fastify server setup and API endpointpackage.json
: Project dependencies and scriptstsconfig.json
: TypeScript configuration.env
: Environment variables (not committed, added to .gitignore)Dependencies Added
playwright
: For web automation@google/generative-ai
: For AI-powered page analysisfastify
: For creating the API server@fastify/cors
: For CORS supportzod
: For input validationdotenv
: For environment variable managementNext Steps
Fixes #230
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Configuration
package.json
file detailing project metadata and scripts.Documentation