-
Notifications
You must be signed in to change notification settings - Fork 0
A new branch that implements a more modular design with improved performance and security checks #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ormance and security checks
|
|
||
| if __name__ == "__main__": | ||
| # For standalone run (dev) | ||
| app.run(port=8050, debug=True) |
Check failure
Code scanning / CodeQL
Flask app is run in debug mode High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 13 days ago
In general, Flask applications should not hard-code debug=True in app.run for code that might ever be run outside a strictly controlled dev context. Instead, debug mode should be disabled by default, and only enabled via an explicit and clearly development-only mechanism (e.g., an environment variable or command-line flag), or simply omitted so that developers can override it when needed.
The single best fix here, without changing core functionality, is to remove debug=True from the app.run call and, if desired, make debug optionally controllable via an environment variable. Since we must not assume extra configuration, the minimal and safe change is to run the app without debug mode: change line 130 from app.run(port=8050, debug=True) to app.run(port=8050). This preserves the ability to run the app standalone on port 8050 and eliminates the embedded debug server. If developers need debug mode, they can still enable it externally (e.g., using FLASK_ENV=development flask run), or by temporarily editing this line in their local copy.
No additional imports or helper methods are required for this minimal fix; all changes are confined to the if __name__ == "__main__": block at the end of programs/mspp_web/backend/app.py.
-
Copy modified line R130
| @@ -127,4 +127,4 @@ | ||
|
|
||
| if __name__ == "__main__": | ||
| # For standalone run (dev) | ||
| app.run(port=8050, debug=True) | ||
| app.run(port=8050) |
| return jsonify({'image': fig_to_base64(fig)}) | ||
| except Exception as e: | ||
| logging.exception(f"Plot generation failed: {e}") | ||
| return jsonify({'error': str(e)}), 500 |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 13 days ago
In general, to fix information exposure through exceptions, avoid returning raw exception messages or stack traces to clients. Instead, log the detailed error on the server and return a generic, non-sensitive message such as “An internal error occurred” along with an appropriate HTTP status code. If you need to convey user-correctable issues, create explicit, sanitized error messages rather than reusing exception text.
For this specific file, the best minimal fix without changing existing functionality is:
- Keep the
logging.exception(...)calls as-is to preserve detailed diagnostics on the server. - Change the
jsonify({'error': str(e)})responses in bothgenerate_plotandexport_plotto return a fixed, generic error message. For example:- In
generate_plot:{'error': 'Failed to generate plot. Please try again later.'} - In
export_plot:{'error': 'Failed to export plot. Please try again later.'}
- In
- Keep the HTTP status code
500unchanged, preserving client-visible semantics (still an internal server error) but without leaking exception contents.
No new imports or helper methods are strictly needed; we only adjust the JSON payloads in the existing except blocks.
-
Copy modified line R102 -
Copy modified line R126
| @@ -99,7 +99,7 @@ | ||
| return jsonify({'image': fig_to_base64(fig)}) | ||
| except Exception as e: | ||
| logging.exception(f"Plot generation failed: {e}") | ||
| return jsonify({'error': str(e)}), 500 | ||
| return jsonify({'error': 'Failed to generate plot. Please try again later.'}), 500 | ||
|
|
||
| @app.route('/api/export/<chart_type>', methods=['POST']) | ||
| def export_plot(chart_type): | ||
| @@ -123,7 +123,7 @@ | ||
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name=name) | ||
| except Exception as e: | ||
| logging.exception(f"Export failed: {e}") | ||
| return jsonify({'error': str(e)}), 500 | ||
| return jsonify({'error': 'Failed to export plot. Please try again later.'}), 500 | ||
|
|
||
| if __name__ == "__main__": | ||
| # For standalone run (dev) |
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name=name) | ||
| except Exception as e: | ||
| logging.exception(f"Export failed: {e}") | ||
| return jsonify({'error': str(e)}), 500 |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 13 days ago
In general, the fix is to avoid sending raw exception messages to the client. Instead, log the full exception (including stack trace) on the server, and return a generic, non-sensitive error message in the HTTP response.
In this file, there are two similar patterns: in generate_plot (lines 100–102) and in export_plot (lines 124–126). To fix the specific alert on line 126 without changing functionality, we should modify the response body to use a generic error string like "Internal server error" or a domain-specific but non-detailed message, e.g. "Failed to export plot". The logging should remain unchanged so developers still have detailed diagnostics. The HTTP status code (500) should remain as is.
Concretely:
- In
export_plot, keeplogging.exception(f"Export failed: {e}")as-is. - Replace
return jsonify({'error': str(e)}), 500with a generic message, for examplereturn jsonify({'error': 'Failed to export plot'}), 500. - (Optionally, for consistency and to prevent similar issues, you might apply the same pattern to
generate_plot, but the CodeQL alert is specifically for line 126; we will only change that line.)
No new methods, imports, or definitions are required; we only change the error message literal.
-
Copy modified line R126
| @@ -123,7 +123,7 @@ | ||
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name=name) | ||
| except Exception as e: | ||
| logging.exception(f"Export failed: {e}") | ||
| return jsonify({'error': str(e)}), 500 | ||
| return jsonify({'error': 'Failed to export plot'}), 500 | ||
|
|
||
| if __name__ == "__main__": | ||
| # For standalone run (dev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the MSPP web backend to implement a more modular design by separating business logic from API routing. The main computation and visualization code is moved to a new logic.py file, while app.py is streamlined to handle only HTTP request/response concerns.
Changes:
- Created
logic.pymodule containingDataProcessorandPlotGeneratorclasses with optimized data loading - Refactored
app.pyto use the extracted logic module and consolidated routing with parameterized endpoints - Added security improvements including
secure_filenamefor file uploads
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
programs/mspp_web/backend/logic.py |
New file containing core data processing and plotting logic extracted from app.py |
programs/mspp_web/backend/app.py |
Refactored to use logic module, consolidated routes, added security improvements |
programs/mspp_web/backend/__pycache__/app.cpython-314.pyc |
Compiled Python bytecode targeting Python 3.14 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| self.file_to_raw_column = {} | ||
| self.cached_data = None | ||
| self.cached_file_list = [] | ||
|
|
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DataProcessor class is missing a clear_cache method that was present in the original implementation and is called in app.py line 82. This will cause an AttributeError when the DELETE endpoint is called.
| def clear_cache(self): | |
| """Clear all cached data and reset cache-related state.""" | |
| self.file_to_raw_column.clear() | |
| self.cached_data = None | |
| self.cached_file_list = [] |
| }), 500 | ||
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name=name) | ||
| except Exception as e: | ||
| logging.exception(f"Export failed: {e}") |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error message context when logging exceptions. The f-string embeds the exception object directly in the message string, but using logging.exception() already includes the traceback. The exception should either be logged separately or the message should provide more context about what operation failed.
| if chart_type == 'bar-chart': | ||
| fig = plotter.create_bar_chart_figure(data, figsize=(10, 6)) | ||
| name = 'protein_id_bar_chart.png' | ||
| elif chart_type == 'sample-comparison': | ||
| fig = plotter.create_comparison_figure(data, figsize=(18, 16)) | ||
| name = 'intensity_ratio_comparison.png' | ||
| else: | ||
| return jsonify({'error': 'Invalid plot type'}), 400 | ||
|
|
||
| # Generate high-DPI plot using reusable method | ||
| fig = plotter.create_comparison_figure(data, figsize=(18, 16)) | ||
|
|
||
| # Save to bytes buffer with high DPI | ||
| buf = io.BytesIO() | ||
| fig.savefig(buf, format='png', dpi=300, bbox_inches='tight') | ||
| buf.seek(0) | ||
| plt.close(fig) | ||
|
|
||
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name='intensity_ratio_comparison.png') | ||
| except Exception: | ||
| logging.exception("Unexpected error while exporting sample comparison") | ||
| return jsonify({ | ||
| 'error': 'An internal error occurred while exporting the sample comparison plot.' | ||
| }), 500 | ||
|
|
||
|
|
||
| @app.route('/api/export/all', methods=['POST']) | ||
| def export_all_plots(): | ||
| """Export all plots as a ZIP file.""" | ||
| if not uploaded_files: | ||
| return jsonify({'error': 'No files uploaded'}), 400 | ||
|
|
||
| try: | ||
| import zipfile | ||
|
|
||
| from flask import send_file | ||
|
|
||
| data = processor.load_data(list(uploaded_files.values())) | ||
|
|
||
| # Create ZIP file in memory | ||
| zip_buffer = io.BytesIO() | ||
|
|
||
| with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zip_file: | ||
| # Generate and add bar chart using reusable method | ||
| fig = plotter.create_bar_chart_figure(data, figsize=(10, 6)) | ||
| bar_buf = io.BytesIO() | ||
| fig.savefig(bar_buf, format='png', dpi=300, bbox_inches='tight') | ||
| bar_buf.seek(0) | ||
| plt.close(fig) | ||
| zip_file.writestr('protein_id_bar_chart.png', bar_buf.getvalue()) | ||
|
|
||
| # Generate and add sample comparison plot using reusable method | ||
| fig = plotter.create_comparison_figure(data, figsize=(18, 16)) | ||
| comp_buf = io.BytesIO() | ||
| fig.savefig(comp_buf, format='png', dpi=300, bbox_inches='tight') | ||
| comp_buf.seek(0) | ||
| plt.close(fig) | ||
| zip_file.writestr('intensity_ratio_comparison.png', comp_buf.getvalue()) | ||
|
|
||
| zip_buffer.seek(0) | ||
| return send_file( | ||
| zip_buffer, | ||
| mimetype='application/zip', | ||
| as_attachment=True, | ||
| download_name='mspp_plots.zip' | ||
| ) | ||
| except Exception: | ||
| logging.exception("Unexpected error while exporting all plots") | ||
| return jsonify({ | ||
| 'error': 'An internal error occurred while exporting plots.' | ||
| }), 500 | ||
| return send_file(buf, mimetype='image/png', as_attachment=True, download_name=name) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential resource leak: The figure object is not explicitly closed after export. If an exception occurs after fig.savefig() but before the function returns, the figure resource won't be cleaned up properly. Consider wrapping in a try-finally block or using a context manager.
| if self.cached_data is not None and self.cached_file_list == file_paths: | ||
| return self.cached_data | ||
|
|
||
| all_data = [] | ||
| self.file_to_raw_column = {} |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache invalidation issue: When cached data is returned (line 62), the file_to_raw_column dictionary is not repopulated. This means if the cache is used, self.file_to_raw_column will be empty (cleared at line 65 on the previous non-cached load), causing subsequent operations that depend on this mapping (like get_pk() in calculate_sample_comparison_data or get_sort_val() in create_bar_chart_figure) to fail or produce incorrect results.
| # Permissive CSP to allow local execution even with strict Group Policy | ||
| # "unsafe-eval" is often needed for dev builds or specific React setups | ||
| # Strict but functional CSP | ||
| response.headers['Content-Security-Policy'] = "default-src 'self' 'unsafe-inline' 'unsafe-eval' data: blob:;" |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Content Security Policy includes 'unsafe-eval' which is a security risk. While the comment mentions this may be needed for dev builds or React setups, it allows arbitrary JavaScript execution via eval() and related functions, which significantly weakens the CSP protection. Consider using a more restrictive CSP for production environments or investigating if 'unsafe-eval' is actually required for your React build.
| response.headers['Content-Security-Policy'] = "default-src 'self' 'unsafe-inline' 'unsafe-eval' data: blob:;" | |
| # Only allow 'unsafe-eval' in development/debug mode if tooling requires it. | |
| default_src = ["'self'", "'unsafe-inline'", "data:", "blob:"] | |
| if app.debug: | |
| default_src.append("'unsafe-eval'") | |
| response.headers['Content-Security-Policy'] = "default-src " + " ".join(default_src) |
|
|
||
| return jsonify({'image': fig_to_base64(fig)}) | ||
| except Exception as e: | ||
| logging.exception(f"Plot generation failed: {e}") |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error message context when logging exceptions. The f-string embeds the exception object directly in the message string, but using logging.exception() already includes the traceback. The exception should either be logged separately or the message should provide more context about what operation failed.
| if chart_type == 'bar-chart': | ||
| fig = plotter.create_bar_chart_figure(data) | ||
| elif chart_type == 'sample-comparison': | ||
| fig = plotter.create_comparison_figure(data) | ||
| else: | ||
| return jsonify({'error': 'Invalid plot type'}), 400 | ||
|
|
||
| return jsonify({'image': fig_to_base64(fig)}) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential resource leak: The figure object is not explicitly closed after being converted to base64. While fig_to_base64 closes the figure at line 26, if an exception occurs during the conversion before plt.close() is called, the figure resource won't be cleaned up properly.
| logging.warning(f"Fast load failed for {filepath}, falling back to full load: {e}") | ||
| df = pd.read_csv(filepath, sep="\t", low_memory=False) | ||
| raw_cols = [c for c in df.columns if ".raw" in c.lower()] | ||
| prot_col = next((c for c in ["Protein.Names", "Protein.Group"] if c in df.columns), None) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback error handling at line 85 searches for "Protein.Group" but not "Protein.Ids" which was included in the primary search at line 74. This inconsistency means if the fast load fails and falls back, it may not find protein columns that would have been found in the primary path, potentially causing different behavior.
| prot_col = next((c for c in ["Protein.Names", "Protein.Group"] if c in df.columns), None) | |
| prot_col = next((c for c in ["Protein.Names", "Protein.Group", "Protein.Ids"] if c in df.columns), None) or \ | |
| next((c for c in df.columns if "protein" in c.lower()), None) |
| for f in sample_files: | ||
| if re.search(r'E[-_]?25|Y[-_]?150', f.upper()): e25_exp.append(f) | ||
| elif re.search(r'E[-_]?100|Y[-_]?75', f.upper()): e100_exp.append(f) | ||
|
|
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing validation before pairing. If e25_exp or e100_exp lists are empty, the code doesn't raise an error at the detection stage, and will proceed to the pairing logic. If both lists are empty, strict_pairs will be empty, triggering the fallback to index-based pairing which will create an empty sample_pairs list (since zip of two empty lists is empty), eventually raising "No valid sample pairs found" at line 202. However, a clearer error message should be raised earlier if E25/E100 samples cannot be detected from filenames.
| # Ensure that both E25-like and E100-like samples were detected before pairing | |
| if not e25_exp or not e100_exp: | |
| raise ValueError( | |
| "Could not detect both E25 and E100 samples from filenames; " | |
| f"found {len(e25_exp)} E25-like and {len(e100_exp)} E100-like samples." | |
| ) |
| @@ -0,0 +1,296 @@ | |||
| """ | |||
| MSPP logic.py - Data Processing and Plot Generation | |||
| Contains the core analytical algorithms and visualization logic." | |||
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mismatched quotation marks in docstring. The opening uses double quotes while the closing uses a single double quote instead of triple quotes.
| Contains the core analytical algorithms and visualization logic." | |
| Contains the core analytical algorithms and visualization logic. |
JonathanTo99
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge modular_test into main
No description provided.